WO2004085650 



Publication Title: 

A HIGH-THROUGHPUT DIAGNOSTIC ASSAY FOR THE HUMAN VIRUS 
CAUSING SEVERE ACUTE RESPIRATORY SYNDROME (SARS) 



Abstract: 

Abstract of WO2004085650 

The present invention relates to a high-throughput diagnostic assay for the virus 
causing Severe Acute Respiratory Syndrome (SARS) in humans ("hSARS 
virus"). In particular, the invention relates to a high-throughput reverse 
transcription-PCR diagnostic test for SARS associated coronavirus (SARS-CoV). 
The present assay is a rapid, reliable assay which can be used for diagnosis and 
monitoring the spread of SARS and is based on the nucleotide sequences of the 
N (nucleocapsid)-gene of the hSARS virus. The present method eliminates false 
negative results and provides increased sensitivity for the assay. The invention 
also discloses the S (spike)-gene of the hSARS virus. The invention further 
relates to the deduced amino acid sequences of the N-gene and S-gene 
products of the hSARS virus and to the use of the N-gene and S-gene products 
in diagnostic methods. The invention further encompasses diagnostic assays and 
kits comprising antibodies generated against the N-gene or S-gene product.



(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property 
Organization 

International Bureau 

(43) International Publication Date 
7 October 2004 (07.10.2004) 




(10) International Publication Number: WO 2004/085650 A1



(51) International Patent Classification 7: C12N 15/50, C07K 14/165, A61K 39/215,
G01N 33/569, A61P 31/14, 11/00

(21) International Application Number: PCT/CN2004/000246

(22) International Filing Date: 24 March 2004 (24.03.2004) 
(25) Filing Language: English 



(26) Publication Language: English

(30) Priority Data:
60/457,031    24 March 2003 (24.03.2003)    US
60/457,730    26 March 2003 (26.03.2003)    US
60/459,931    2 April 2003 (02.04.2003)     US
60/460,357    3 April 2003 (03.04.2003)     US
60/461,265    8 April 2003 (08.04.2003)     US
60/462,805    14 April 2003 (14.04.2003)    US
60/464,886    23 April 2003 (23.04.2003)    US
60/465,738    25 April 2003 (25.04.2003)    US
60/470,935    14 May 2003 (14.05.2003)      US



(71) Applicant (for all designated States except US): THE 
UNIVERSITY OF HONG KONG [CN/CN]; G18, Eliot 
Hall, Pokfulam Road, Hong Kong (CN). 

(72) Inventors: CHAN, Kwokhung; 3A, Wisteria Mansion, 
Taikoo Shing, Quarry Bay, Hong Kong (CN). GUAN, Yi; 
11A, Block 4, Pokfulam Garden, 180 Pokfulam Road, 
Hong Kong (CN). NICHOLLS, John, Malcolm; 5B Royalton, 
118 Pokfulam Road, Pokfulam, Hong Kong (CN). 
PEIRIS, Joseph, Sriyal, Malik; 19/E, Block 29, Baguio 
Villa, 550 Victoria Road, Hong Kong (CN). POON, 
Litman; Flat 21H, Block 2, Phase 3, Belvedere Garden, 
Tsuen Wan, Hong Kong (CN). YUEN, Kwokyung; Flat 
C, 19/F, Block 20, Baguio Villa, 555 Victoria Road, Hong 
Kong (CN). LEUNG, Frederick, C; 24th Floor, Starr 
Hall, 91B Pokfulam Road, Hong Kong (CN). 

(74) Agent: CHINA PATENT AGENT (H.K) LTD.; 22/F, 
Great Eagle Centre, 23 Harbour Road, Wanchai, Hong 
Kong (CN). 

(81) Designated States (unless otherwise indicated, for every 
kind of national protection available): AE, AG, AL, AM, 
AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, 
CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, 
GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, 
KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, 
MG, MK, MN, MW, MX, MZ, NA, NI, NO, NZ, OM, PG, 
PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, 
TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM, 
ZW. 

(84) Designated States (unless otherwise indicated, for every 
kind of regional protection available): ARIPO (BW, GH, 
GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, 
GB, GR, HU, IE, IT, LU, MC, NL, PL, PT, RO, SE, SI, SK, 
TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, 
ML, MR, NE, SN, TD, TG). 

Published: 

— with international search report 

For two-letter codes and other abbreviations, refer to the "Guidance 
Notes on Codes and Abbreviations" appearing at the beginning 
of each regular issue of the PCT Gazette. 



A HIGH-THROUGHPUT DIAGNOSTIC ASSAY FOR THE HUMAN VIRUS CAUSING 
SEVERE ACUTE RESPIRATORY SYNDROME (SARS) 



This application claims priority benefit to U.S. provisional application no. 
60/457,031, filed March 24, 2003; U.S. provisional application no. 60/457,730, filed March 
26, 2003; U.S. provisional application no. 60/459,931, filed April 2, 2003; U.S. provisional 
application no. 60/460,357, filed April 3, 2003; U.S. provisional application no. 60/461,265, 
filed April 8, 2003; U.S. provisional application no. 60/462,805, filed April 14, 2003; U.S. 
provisional application no. 60/464,886, filed April 23, 2003; U.S. provisional application no. 
60/465,738, filed April 25, 2003; and U.S. provisional application no. 60/470,935, filed 
May 14, 2003, each of which is incorporated herein by reference in its entirety. 

The instant application contains a lengthy Sequence Listing which is being 
concurrently submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby 
incorporated by reference in its entirety. Said CD-Rs, recorded on March 22, 2004, are 
labeled "CRF", "Copy 1" and "Copy 2", respectively, and each contains only one identical 
1.6 MB file (V9661077.APP). 

1. INTRODUCTION 

The present invention relates to a high-throughput diagnostic assay for the virus 
causing Severe Acute Respiratory Syndrome (SARS) in humans ("hSARS virus"). In 
particular, the invention relates to a high-throughput reverse transcription-PCR diagnostic 
test for SARS-associated coronavirus (SARS-CoV). The present assay is a rapid, reliable 
assay which may be used for diagnosis and monitoring the spread of SARS. The present 
method eliminates false negative results and provides increased sensitivity for the assay. 
The invention further relates to nucleotide sequences and portions thereof, useful for the 
diagnosis of SARS. The invention further relates to nucleotide sequences and portions 
thereof, useful for assessing genetic diversity of SARS. The nucleotide sequences of the 
present invention comprise the (nucleocapsid) N-gene and the (spike) S-gene sequences of the 
hSARS virus. The invention relates to a diagnostic kit that comprises nucleic acid 
molecules for the detection of the N-gene or S-gene of hSARS virus. The invention also 
relates to the deduced amino acid sequences of the N-gene and S-gene products of the 
hSARS virus. The invention further relates to the use of the N-gene and S-gene products in 
diagnostic methods. The invention further encompasses diagnostic assays and kits 
comprising antibodies generated against the N-gene or S-gene product. 

2. BACKGROUND OF THE INVENTION 

Recently, there has been an outbreak of atypical pneumonia in Guangdong province 
in mainland China. Between November 2002 and March 2003, there were 792 reported 
cases with 31 fatalities (WHO. Severe Acute Respiratory Syndrome (SARS). Weekly 
Epidemiol Rec. 2003; 78: 86). Patients with SARS show various clinical symptoms, 
including fever (of 38 degrees Celsius or above for over 24 hours), malaise, chills, headache 
and body ache. Chest X-rays show changes compatible with pneumonia. Other symptoms 
include coughing, shortness of breath or difficulty in breathing. By 3 May 2003, a 
cumulative total of 1621 cases and 179 deaths had occurred in Hong Kong, 
which accounted for 26% and 41% of the globally reported cases (6234) and deaths (435), 
respectively. As the disease is highly contagious and spreads through daily-life activities, it is 
important to develop a rapid and reliable diagnostic test to monitor and control the disease. 
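
The Hong Kong shares quoted above can be rechecked with simple arithmetic. The following is only an illustrative sketch: the four counts are taken from the text, while the function and variable names are hypothetical and are not part of the application.

```python
# Counts quoted in the paragraph above (cumulative to 3 May 2003).
hk_cases, hk_deaths = 1621, 179
global_cases, global_deaths = 6234, 435


def share_percent(part: int, whole: int) -> float:
    """Return `part` expressed as a percentage of `whole`."""
    return 100.0 * part / whole


case_share = share_percent(hk_cases, global_cases)
death_share = share_percent(hk_deaths, global_deaths)

# Rounded to whole percentages, these agree with the 26% and 41% stated above.
print(f"Hong Kong share of reported cases:  {case_share:.1f}%")
print(f"Hong Kong share of reported deaths: {death_share:.1f}%")
```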
In response to this crisis, the Hospital Authority in Hong Kong has increased the 
surveillance on patients with severe atypical pneumonia. In the course of this investigation, 
a number of clusters of health care workers with the disease were identified. In addition, 
there were clusters of pneumonia incidents among persons in close contact with those 
infected. The disease was unusual in its severity and its progression in spite of the 
antibiotic treatment typical for the bacterial pathogens that are known to be commonly 
associated with atypical pneumonia. The present inventors were one of the groups involved 
in the investigation of these patients. All tests for identifying commonly recognized viruses 
and bacteria were negative in these patients. Furthermore, diagnostic tests for the detection 
of other genes in the hSARS virus, such as the 1b-gene, are not useful to accurately diagnose 
SARS. The disease was given the acronym Severe Acute Respiratory Syndrome ("SARS"). 
This virus mutates and changes rapidly and hence the diagnosis of SARS was extremely 
difficult until the isolation of particular regions of the virus, the N-gene and S-gene, of the 
hSARS virus from the SARS patients by the present inventors as disclosed herein. Namely, 
the present invention discloses a diagnostic assay using particular regions in the genome of 
the virus for rapid, accurate, reliable and specific identification of the hSARS virus. The 
invention is useful in both clinical and scientific research applications. Furthermore, the 
present invention provides a high-throughput assay which can be used as a sensitive method 
for diagnosis and monitoring the spread of SARS. 

3. SUMMARY OF INVENTION 

The present invention is based upon the inventors' identification of a specific region 
of the hSARS virus, specifically, the 3' region of the hSARS viral genome, and in particular, 
the (nucleocapsid) N-gene of the hSARS virus that may be used in a diagnostic assay to 
detect hSARS. In particular, the N-gene is useful for the diagnosis of SARS because the 
N-gene has the most abundant copy number during viral infection compared to any other gene 
in the hSARS virus, especially when the cells are lysed. Thus, the nucleic acid sequences of 
the N-gene of the hSARS virus are particularly useful in a rapid and reliable diagnostic 
assay for the hSARS virus. Furthermore, the present method eliminates false negative 
results and increases the sensitivity of the assay. 

The hSARS virus was isolated from patients suffering from SARS in the recent 
outbreak of severe atypical pneumonia in China. The isolated virus is an enveloped, 
single-stranded RNA virus of positive polarity which belongs to the family Coronaviridae 
of the order Nidovirales. The hSARS virus is a very large RNA virus consisting of 
approximately 29,742 nucleotides. The complete genomic sequence of the hSARS virus 
was deposited in GenBank, NCBI, with Accession No. AY278491 (SEQ ID NO:15), which 
is incorporated by reference. The isolated hSARS virus was deposited with the China Center 
for Type Culture Collection (CCTCC) on April 2, 2003 and accorded an accession number, 
CCTCC-V200303, as described in Section 7, infra, which is incorporated by reference. 
Also, the entire genome sequence of the hSARS virus, CCTCC-V200303, and 
characterization thereof are disclosed in a United States Patent Application with Attorney 
Docket No. V9661.0069 filed concurrently herewith on March 24, 2004, which is 
incorporated by reference in its entirety. The virus mutates and changes rapidly, 
making the diagnosis of SARS very difficult. The present inventors have designed a 
diagnostic assay for detecting the presence of N-gene nucleic acid sequence or protein to 
rapidly, accurately, and specifically identify the hSARS virus. Furthermore, the present 
inventors have designed a diagnostic assay for detecting the presence of S-gene nucleic acid 
sequence or protein to determine the genetic diversity of the hSARS virus. Accordingly, the 
invention relates to methods of detecting nucleotide sequences of the N-gene and S-gene of 
the hSARS virus. 

The present invention provides a rapid, reliable assay for the detection of hSARS 
virus. In a preferred embodiment, the detection of hSARS virus includes the use of the 
nucleic acid molecules of the present invention in a polymerase chain reaction, Reverse 
transcription-Polymerase Chain Reaction (RT-PCR), Southern analysis, Northern analysis, 
or other nucleic acid hybridization for the detection of hSARS nucleic acids. In one 
embodiment, the invention provides methods for detecting the presence, activity or 
expression of the hSARS virus of the invention in a biological material, such as cells, 
nasopharyngeal aspirate, sputum, blood, saliva, urine, stool and so forth. In preferred 
embodiments, the biological material is nasopharyngeal aspirate or stool. The increased or 
decreased activity or expression of the hSARS virus in a sample relative to a control sample 
can be determined by contacting the biological material with an agent which can detect 
directly or indirectly the presence, activity or expression of the hSARS virus. In a specific 
embodiment, the detecting agents are the nucleic acid molecules of the present invention. 

The present invention also relates to a method for identifying a subject infected with 
the hSARS virus, said method comprising obtaining total RNA from a biological sample 
obtained from the subject; reverse transcribing the total RNA to obtain cDNA; and 
subjecting the cDNA to PCR assay using a set of primers derived from a nucleotide 
sequence of the hSARS virus. In preferred embodiments, the primers are derived from the 
(nucleocapsid) N-gene. In most preferred embodiments, the primers comprise the nucleotide 
sequences of SEQ ID NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481. In 
other preferred embodiments, the primers are derived from the (spike) S-gene. In more 
preferred embodiments, the primers comprise the nucleotide sequences of SEQ ID 
NOS:2477 and/or 2478. 
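
The three steps recited above (obtain RNA, reverse transcribe to cDNA, amplify with a primer pair) can be mimicked in silico as a rough sketch. Everything below is an illustrative assumption: the RNA and primer strings are invented toy sequences, not SEQ ID NOS:2475-2478 or any hSARS sequence, the function names are hypothetical, and exact string matching is a simplification of real primer annealing.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_transcribe(rna: str) -> str:
    """Model the cDNA as the sense-strand DNA sequence (U replaced by T)."""
    return rna.upper().replace("U", "T")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the product bounded by the two primers, or None if either fails to match.

    The forward primer is searched on the sense strand; the reverse primer
    anneals to the sense strand, so its reverse complement is searched there.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Toy data only (hypothetical sequences chosen for illustration).
toy_rna = "GGCAUGCUAGCUUACGGAUCCGAUUGCAAGCUUGGCAUGCAAUU"
forward = "CTAGCTTACG"
reverse = "TGCCAAGCTT"

cdna = reverse_transcribe(toy_rna)
amplicon = in_silico_pcr(cdna, forward, reverse)
print(amplicon, len(amplicon) if amplicon else 0)
```

A result of `None` in this sketch corresponds to no amplification product, which in a real assay would still need to be interpreted alongside suitable controls.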

The invention further relates to the use of the sequence information of the isolated 
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention 
provides nucleic acid molecules which are suitable for use as primers consisting of or 
comprising the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a 
complement thereof, or at least a portion of the nucleotide sequence thereof. In the most 
preferred embodiment, the primers comprise the nucleic acid sequence of SEQ ID 
NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481 for the detection of N-gene. In 
another most preferred embodiment, the primers comprise the nucleic acid sequence of SEQ 
ID NOS:2477 and/or 2478 for the detection of S-gene. In another specific embodiment, the 
invention provides nucleic acid molecules which are suitable for hybridization to hSARS 
nucleic acid, including, but not limited to, as PCR primers, Reverse Transcriptase primers, 
probes for Southern analysis or other nucleic acid hybridization analysis for the detection of 
hSARS nucleic acids, e.g., consisting of or comprising the nucleotide sequence of SEQ ID 
NO:1, 11, 13, 15, 2471, 2473, 2475, 2476, 2477, 2478, 2480 or 2481, or a complement 
thereof, or a portion thereof. In a preferred embodiment, primers that amplify fragments 
comprising (nucleotide position 18057 to 18222 or portions thereof of SEQ ID NO:15) 1b 
gene; (nucleotide position 21920-22107, or portions thereof of SEQ ID NO:15) M-gene; 
and (nucleotide position 28658-28883, or portions thereof of SEQ ID NO:15) N-gene may 
be used for probe synthesis for detection of hSARS nucleic acids. In a specific embodiment, 
the invention provides a diagnostic kit comprising nucleic acid molecules which are suitable 
for use to detect the N-gene of hSARS. In a specific embodiment, the N-gene comprises 
the nucleic acid sequence of SEQ ID NO:2471. In a specific embodiment, the nucleic acid 
molecules comprise the nucleic acid sequence of SEQ ID NOS:2475 and/or 2476 or SEQ ID 
NOS:2480 and/or 2481. In preferred embodiments, the diagnostic kit further comprises a 
control for amplification of the 1b gene. In specific embodiments, the primers used for 
amplifying the 1b gene are SEQ ID NOS:3 and/or 4. In other specific embodiments, the 
diagnostic kit further comprises an internal control using the pig β-actin gene. In specific 
embodiments, the primers used for amplifying the β-actin gene are SEQ ID NOS:2482 and/or 
2483. 
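
The nucleotide positions recited above for the 1b, M and N fragments of SEQ ID NO:15 translate directly into fragment lengths and slices. The sketch below assumes the positions are 1-based and inclusive, which is the usual convention for sequence listings but is not stated in this passage; the `Region` type, the function names and the placeholder genome are hypothetical and stand in for the real sequence.

```python
from typing import NamedTuple


class Region(NamedTuple):
    gene: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


# Positions on SEQ ID NO:15 as recited in the text above.
REGIONS = [
    Region("1b", 18057, 18222),
    Region("M", 21920, 22107),
    Region("N", 28658, 28883),
]


def region_length(region: Region) -> int:
    """Number of nucleotides spanned by an inclusive 1-based interval."""
    return region.end - region.start + 1


def extract(genome: str, region: Region) -> str:
    """Slice a region out of a genome string using 1-based inclusive positions."""
    if not 1 <= region.start <= region.end <= len(genome):
        raise ValueError(f"{region.gene}: positions fall outside the sequence")
    return genome[region.start - 1:region.end]


# Placeholder sequence of roughly genome size; NOT the hSARS genome.
placeholder_genome = "ACGT" * 7436

for region in REGIONS:
    fragment = extract(placeholder_genome, region)
    print(region.gene, region_length(region), len(fragment))
```

Under that assumption the three fragments span 166, 188 and 226 nucleotides respectively.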

In a specific embodiment, the invention provides a diagnostic kit comprising nucleic 
acid molecules which are suitable for use to detect the S-gene of hSARS. In a specific 
embodiment, the S-gene comprises the nucleic acid sequence of SEQ ID NO:2473. In a specific 
embodiment, the nucleic acid molecules comprise the nucleic acid sequence of SEQ ID 
NOS:2477 and/or 2478. The invention further encompasses chimeric or recombinant viruses 
encoded in whole or in part by said nucleotide sequences. 
In another specific embodiment, the invention provides nucleic acid molecules 
comprising a nucleotide sequence of SEQ ID NO:1, 11, 13, 2471, and/or 2473. In a specific 
embodiment, the present invention provides isolated nucleic acid molecules comprising or, 
alternatively, consisting of the nucleotide sequence of SEQ ID NO:1, a complement thereof 
or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 
350, 400, 450, 500, 550, 600, or more contiguous nucleotides of the nucleotide sequence of 
SEQ ID NO:1, or a complement thereof. In another specific embodiment, the present 
invention provides isolated nucleic acid molecules comprising or, alternatively, consisting 
of the nucleotide sequence of SEQ ID NO:11, a complement thereof or a portion thereof, 
preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 
550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, or more 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:11, or a complement 
thereof. In yet another specific embodiment, the present invention provides isolated nucleic 
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ 
ID NO:13, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:13, or a complement 
thereof. In another specific embodiment, the present invention provides isolated nucleic 
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ 
ID NO:15, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 
9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 
20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a complement 
thereof. In another specific embodiment, the present invention provides isolated nucleic 
acid molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ 
ID NO:2471, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 
30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO:2471, or a complement thereof. In another specific 
embodiment, the present invention provides isolated nucleic acid molecules comprising or, 
alternatively, consisting of the nucleotide sequence of SEQ ID NO:2473, a complement 
thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 
200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 
1,100, 1,150, 1,200, 2,000, 3,000, or more contiguous nucleotides of the nucleotide 
sequence of SEQ ID NO:2473, or a complement thereof. Furthermore, in another specific 
embodiment, the invention provides isolated nucleic acid molecules which hybridize under 
stringent conditions, as defined herein, to a nucleic acid molecule having the sequence of 
SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a complement thereof. In one embodiment, the 
invention provides an isolated nucleic acid molecule which is antisense to the coding strand 
of a nucleic acid of the invention. In another specific embodiment, the invention provides 
isolated polypeptides or proteins that are encoded by a nucleic acid molecule comprising or, 
alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, or more contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO:1, or a complement thereof. In yet another specific 
embodiment, the invention provides isolated polypeptides or proteins that are encoded by a 
nucleic acid molecule comprising or, alternatively, consisting of a nucleotide sequence that 
is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 
650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO:11, or a complement thereof. In yet 
another specific embodiment, the invention provides isolated polypeptides or proteins that 
are encoded by a nucleic acid molecule comprising or, alternatively, consisting of a 
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 
400, 450, 500, 550, 600, 650, 700, or more contiguous nucleotides of the nucleotide 
sequence of SEQ ID NO:13, or a complement thereof. In yet another specific embodiment, 
the invention provides isolated polypeptides or proteins that are encoded by a nucleic acid 
molecule comprising or, alternatively, consisting of a nucleotide sequence that is at least 5, 
10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 
750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 
7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 
19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 
more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a 
complement thereof. In yet another specific embodiment, the invention provides isolated 
polypeptides or proteins that are encoded by a nucleic acid molecule comprising or, 
alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous nucleotides of the nucleotide sequence 
of SEQ ID NO:2471, or a complement thereof. In yet another specific embodiment, the 
invention provides isolated polypeptides or proteins that are encoded by a nucleic acid 
molecule comprising or, alternatively, consisting of a nucleotide sequence that is at least 5, 
10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 
750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, or more 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:2473, or a complement 
thereof. The invention further provides proteins or polypeptides that are isolated from the 
hSARS virus, including viral proteins isolated from cells infected with the virus but not 
present in comparable uninfected cells. The invention further provides proteins or 
polypeptides shown in Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107) and 12 
(SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470). The invention further provides 
proteins or polypeptides having the amino acid sequence of SEQ ID NO:2472 or 2474. The 
polypeptides or the proteins of the present invention preferably have a biological activity of 
the protein (including antigenicity and/or immunogenicity) encoded by the sequence of 
SEQ ID NO:1, 11, 13, 2471, or 2473. In other embodiments, the polypeptides or the 
proteins of the present invention have a biological activity of the protein (including 
antigenicity and/or immunogenicity) encoded by a nucleotide sequence that is at least 5, 10, 
15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 
800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 
7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 
19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or 
more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a 
complement thereof. In other embodiments, the polypeptides or the proteins of the present 
invention have a biological activity of the protein (including antigenicity and/or 
immunogenicity) of Figures 11 (SEQ ID NOS:17-239, 241-736 and 738-1107) and 12 (SEQ 
ID NOS:1109-1589, 1591-1964 and 1966-2470). The invention further provides proteins or 
polypeptides having a biological activity of the protein having the amino acid sequence of SEQ 
ID NO:2472 or 2474. 
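
The recurring phrase "at least N contiguous nucleotides of the nucleotide sequence ..., or a complement thereof" can be read as a simple sequence test. The sketch below is one possible reading, offered only as an illustration: it treats "complement" as the reverse complement, uses invented toy sequences rather than any SEQ ID NO, and its function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_shared_run(query: str, reference: str) -> int:
    """Length of the longest contiguous stretch common to both sequences.

    Classic longest-common-substring dynamic programme, keeping one row at a time.
    """
    query, reference = query.upper(), reference.upper()
    best = 0
    previous = [0] * (len(reference) + 1)
    for q in query:
        current = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if q == r:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_contiguous_match(query: str, reference: str, min_len: int) -> bool:
    """True if query shares at least min_len contiguous nucleotides with the
    reference or with its reverse complement (an assumed reading of "complement")."""
    run = max(
        longest_shared_run(query, reference),
        longest_shared_run(query, reverse_complement(reference)),
    )
    return run >= min_len


# Toy reference and fragments (hypothetical; not taken from the sequence listing).
toy_reference = "AACCGGTTACGTACGTTTGACA"
sense_fragment = toy_reference[4:14]
antisense_fragment = reverse_complement(sense_fragment)

print(has_contiguous_match(sense_fragment, toy_reference, 10))
print(has_contiguous_match(antisense_fragment, toy_reference, 10))
```

The quadratic-time comparison is adequate for short illustrative inputs; screening against a full genome would normally use an indexed search instead.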

In one aspect, the invention provides a method for propagating the hSARS virus in 
host cells comprising infecting the host cells with the isolated hSARS virus, culturing the 
host cells to allow the virus to multiply, and harvesting the resulting virions. Also provided 
by the present invention are host cells that are infected with the hSARS virus. 

In one aspect, the invention relates to the use of the isolated hSARS virus for 
diagnostic and therapeutic methods. In a specific embodiment, the invention provides a 
method of detecting in a biological sample an antibody immunospecific for the hSARS 
virus using the isolated hSARS virus or any proteins or polypeptides thereof. In another 
specific embodiment, the invention provides a method of screening for an antibody which 
immunospecifically binds and neutralizes hSARS. Such an antibody is useful for passive 
immunization or immunotherapy of a subject infected with hSARS. 
The invention further provides antibodies that specifically bind a polypeptide of the 
invention encoded by the nucleotide sequence of SEQ ID NO:1, 11, 13, 2471, or 2473, or a 
fragment thereof, or encoded by a nucleic acid comprising a nucleotide sequence that 
hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:1, 11, 13, 
2471, or 2473, and/or any hSARS epitope, having one or more biological activities of a 
polypeptide of the invention. The invention further provides antibodies that specifically 
bind polypeptides of the invention encoded by the nucleotide sequence of SEQ ID NO:15, 
or a fragment thereof. These polypeptides include those shown in Figures 11 (SEQ ID 
NOS:17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and 
1966-2470). In another embodiment, the polypeptide comprises the amino acid sequence of 
SEQ ID NO:2472 or 2474. The invention further provides antibodies that specifically bind 
polypeptides of the invention encoded by a nucleic acid comprising a nucleotide sequence 
that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:15, 
and/or any hSARS epitope, having one or more biological activities of a polypeptide of the 
invention. Such antibodies include, but are not limited to, polyclonal, monoclonal, 
bi-specific, multi-specific, human, humanized, chimeric antibodies, single chain antibodies, 
Fab fragments, F(ab')2 fragments, disulfide-linked Fvs, intrabodies and fragments 
containing either a VL or VH domain or even a complementarity determining region (CDR) 
that specifically binds to a polypeptide of the invention. 

In another embodiment, the invention provides vaccine preparations, comprising the 
hSARS virus, including recombinant and chimeric forms of said virus, or protein subunits 
of the virus. In a specific embodiment, the vaccine preparations of the present invention 
comprise live but attenuated hSARS virus with or without adjuvants. In another specific 
embodiment, the vaccine preparations of the invention comprise an inactivated or killed 
hSARS virus. Such attenuated or inactivated viruses may be prepared by a series of 
passages of the virus through the host cells or by preparing recombinant or chimeric forms 
of virus. Accordingly, the present invention further provides methods of preparing 
recombinant or chimeric forms of hSARS. In another specific embodiment, the vaccine 
preparations of the present invention comprise a nucleic acid or fragment of the hSARS 
virus, e.g., the virus having accession no. CCTCC-V200303, or nucleic acid molecules 
having the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473, or a fragment thereof. In 
another embodiment, the invention provides vaccine preparations comprising one or more 
polypeptides isolated from or produced from nucleic acid of hSARS virus, for example, of 
deposit accession no. CCTCC-V200303. In a specific embodiment, the vaccine 
preparations comprise a polypeptide of the invention encoded by the nucleotide sequence of 
SEQ ID NO:1, 11, 13, 2471 or 2473, or a fragment thereof. In a specific embodiment, the 
vaccine preparations comprise polypeptides of the invention as shown in Figures 11 (SEQ 
ID NOS:17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and 
1966-2470) or encoded by the nucleotide sequence of SEQ ID NO:15, or a fragment thereof. 
In a specific embodiment, the vaccine preparations comprise polypeptides comprising the 
amino acid sequence of SEQ ID NO:2472 or 2474. Furthermore, the present invention 
provides methods for treating, ameliorating, managing or preventing SARS by 
administering the vaccine preparations or antibodies of the present invention alone or in 
combination with adjuvants, or other pharmaceutically acceptable excipients. 

In another aspect, the present invention provides pharmaceutical compositions 
comprising anti-viral agents of the present invention and a pharmaceutically acceptable 
carrier. In a specific embodiment, the anti-viral agent of the invention is an antibody that 
immunospecifically binds hSARS virus or any hSARS epitope. In preferred embodiments, 
such antibodies neutralize the hSARS virus. In a specific embodiment, the anti-viral agent 
of the invention binds a fragment, variant, or homolog of the N-gene or S-gene of hSARS virus.
In a specific embodiment, the anti-viral agent of the invention binds a fragment, variant, or
homolog of a polypeptide comprising the amino acid sequence of SEQ ID NO: 2472 or
2474. In another specific embodiment, the anti-viral agent is a polypeptide or protein of the 
present invention or nucleic acid molecule of the invention. The invention also provides 
kits containing a pharmaceutical composition of the present invention. 

3.1 Definitions 

The term "an antibody or an antibody fragment that immunospecifically binds a 
polypeptide of the invention" as used herein refers to an antibody or a fragment thereof that 
immunospecifically binds to the polypeptide encoded by the nucleotide sequence of SEQ ID 
NO: 1, 11, 13, 15, 2471 or 2473, or the polypeptides shown in Figures 11 and 12, or a fragment
thereof, and does not non-specifically bind to other polypeptides. An antibody or a
fragment thereof that immunospecifically binds to the polypeptide of the invention may
cross-react with other antigens. Preferably, an antibody or a fragment thereof that
immunospecifically binds to a polypeptide of the invention does not cross-react with other
antigens. An antibody or a fragment thereof that immunospecifically binds to the
polypeptide of the invention can be identified by, for example, immunoassays or other
techniques known to those skilled in the art. 

An "isolated" or "purified" peptide or protein is substantially free of cellular material 
or other contaminating proteins from the cell or tissue source from which the protein is
derived, or substantially free of chemical precursors or other chemicals when chemically
synthesized. The language "substantially free of cellular material" includes preparations of
a polypeptide/protein in which the polypeptide/protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced. Thus, a
polypeptide/protein that is substantially free of cellular material includes preparations of the
polypeptide/protein having less than about 30%, 20%, 10%, 5%, 2.5%, or 1% (by dry
weight) of contaminating protein. When the polypeptide/protein is recombinantly produced,
it is also preferably substantially free of culture medium, i.e., culture medium represents
less than about 20%, 10%, or 5% of the volume of the protein preparation. When the
polypeptide/protein is produced by chemical synthesis, it is preferably substantially free of
chemical precursors or other chemicals, i.e., it is separated from chemical precursors or
other chemicals which are involved in the synthesis of the protein. Accordingly, such
preparations of the polypeptide/protein have less than about 30%, 20%, 10%, or 5% (by dry
weight) of chemical precursors or compounds other than the polypeptide/protein fragment of
interest. In a preferred embodiment of the present invention, polypeptides/proteins are
isolated or purified.

An "isolated" nucleic acid molecule is one which is separated from other nucleic 
acid molecules which are present in the natural source of the nucleic acid molecule. 
Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be 
substantially free of other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical precursors or other chemicals
when chemically synthesized. In a preferred embodiment of the invention, nucleic acid
molecules encoding polypeptides/proteins of the invention are isolated or purified. The
term "isolated" nucleic acid molecule does not include a nucleic acid that is a member of a
library that has not been purified away from other library clones containing other nucleic
acid molecules. 

The term "portion" or "fragment" as used herein refers to a fragment of a nucleic 
acid molecule containing at least about 25, 30, 35, 40, 45, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250,
1300, 1350, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000,
13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000,
24,000, 25,000, 26,000, 27,000, 28,000, 29,000, or more contiguous nucleic acids in length
of the relevant nucleic acid molecule and having at least one functional feature of the
nucleic acid molecule (or the encoded protein has one functional feature of the protein
encoded by the nucleic acid molecule); or a fragment of a protein or a polypeptide
containing at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 120,
140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 400, 500, 600, 700, 800, 900,
1,000, 1,500, 2,000, 2,500, 3,000, 3,500, 4,000, 4,100, 4,200, 4,300, 4,350, 4,360, 4,370, or
4,380 amino acid residues in length of the relevant protein or polypeptide and having at
least one functional feature of the protein or polypeptide.

The term "3' region of the hSARS viral genome" refers to the region from about nucleotide
position 18,000 to 29742 of SEQ ID NO:15. 

The term "having a biological activity of the protein" or "having biological activities
of the polypeptides of the invention" refers to the characteristics of the polypeptides or
proteins having a common biological activity, a similar or identical structural domain, and/or
sufficient amino acid identity to the polypeptide encoded by the nucleotide sequence
of SEQ ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471 or 2473. Such common
biological activities of the polypeptides of the invention include antigenicity and
immunogenicity.

The term "under stringent condition" refers to hybridization and washing conditions
under which nucleotide sequences having at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, or at least 95% identity to each other remain hybridized to each other.
Such hybridization conditions are described in, for example but not limited to, Current
Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6; Basic
Methods in Molecular Biology, Elsevier Science Publishing Co., Inc., N.Y. (1986), pp. 75-78
and 84-87; and Molecular Cloning, Cold Spring Harbor Laboratory, N.Y. (1982), pp.
387-389, and are well known to those skilled in the art. A preferred, non-limiting example
of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate
(SSC), 0.5% SDS at about 68°C followed by one or more washes in 2X SSC, 0.5% SDS at
room temperature. Another preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6X SSC at about 45°C followed by one or more washes in
0.2X SSC, 0.1% SDS at about 50-65°C.

The term "variant" as used herein refers either to a naturally occurring genetic
mutant of hSARS or a recombinantly prepared variation of hSARS, each of which contains
one or more mutations in its genome compared to the hSARS of CCTCC-V200303. The
term "variant" may also refer either to a naturally occurring variation of a given peptide or
a recombinantly prepared variation of a given peptide or protein in which one or more
amino acid residues have been modified by amino acid substitution, addition, or deletion.

4. BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 shows a partial DNA sequence (SEQ ID NO: 1) and its deduced amino acid
sequence (SEQ ID NO:2) obtained from the SARS virus that has 57% homology to the
RNA-dependent RNA polymerase protein of known Coronaviruses.

Figure 2 shows an electron micrograph of the novel hSARS virus, which has
morphological characteristics similar to those of coronaviruses.

Figure 3 shows an immunofluorescent staining for IgG antibodies that are
specifically bound to the FrHK-4 cells infected with the novel human respiratory virus of
Coronaviridae.

Figure 4 shows an electron micrograph of ultra-centrifuged deposit of hSARS virus 
that was grown in the cell culture and negatively stained with 3% potassium
phosphotungstate at pH 7.0.

Figure 5A shows a thin-section electron micrograph of lung biopsy of a patient with
SARS; and Figure 5B shows a thin-section electron micrograph of hSARS-infected cells.

Figure 6 shows the result of phylogenetic analysis for the partial protein sequence
(215 amino acids; SEQ ID NO: 2) of the hSARS virus (GenBank accession number
AY268070). The phylogenetic tree is constructed by the neighbor-joining method. The
horizontal-line distance represents the number of sites at which the two sequences compared
are different. Bootstrap values are deduced from 500 replicates.

Figure 7A shows an amplification plot of fluorescence intensity against the PCR
cycle in a real-time quantitative PCR assay that can detect a hSARS virus in samples
quantitatively. The copy numbers of input plasmid DNA in the reactions are indicated. The
X-axis denotes the cycle number of a quantitative PCR assay and the Y-axis denotes the
fluorescence intensity (FI) over the background. Figure 7B shows the result of a melting
curve analysis of PCR products from clinical samples. Signals from positive (+ve) samples,
negative (-ve) samples and water control (water) are indicated. The X-axis denotes the
temperature (°C) and the Y-axis denotes the fluorescence intensity (FI) over the
background.

Figure 8 shows another partial DNA sequence (SEQ ID NO: 11) and its deduced
amino acid sequence (SEQ ID NO: 12) obtained from the SARS virus.

Figure 9 shows yet another partial DNA sequence (SEQ ID NO: 13) and its deduced 
amino acid sequence (SEQ ID NO: 14) obtained from the SARS virus. 

Figure 10 shows the entire genomic DNA sequence (SEQ ID NO: 15) of the SARS
virus.

Figure 11 shows the deduced amino acid sequences obtained from SEQ ID NO: 15 in
three frames (see SEQ ID NOS: 16, 240 and 737). An asterisk (*) indicates a stop codon
which marks the end of a peptide. The first-frame amino acid sequences: SEQ ID NOS: 17-239;
the second-frame amino acid sequences: SEQ ID NOS: 241-736; and the third-frame
amino acid sequences: SEQ ID NOS: 738-1107.

Figure 12 shows the deduced amino acid sequences obtained from the complement
of SEQ ID NO: 15 in three frames (see SEQ ID NOS: 1108, 1590 and 1965). An asterisk (*)
indicates a stop codon which marks the end of a peptide. The first-frame amino acid
sequences: SEQ ID NOS: 1109-1589; the second-frame amino acid sequences: SEQ ID
NOS: 1591-1964; and the third-frame amino acid sequences: SEQ ID NOS: 1966-2470.

Figure 13 shows the N-gene primer sequences (which amplify nucleotides at
position 29247-29410 of SEQ ID NO:2471): 150# (SEQ ID NO:2475) and 200# (SEQ ID
NO: 2476); and the S-gene primer sequences (which amplify nucleotides at position 24751 to
25049 of SEQ ID NO:2473): 131# (SEQ ID NO:2477) and 132# (SEQ ID NO:2478).

Figure 14A shows the nucleic acid sequence of the N-gene (SEQ ID NO:2471).

Figure 14B shows the amino acid sequence of the N-gene (SEQ ID NO: 2472). 

Figure 15A shows the nucleic acid sequence of the S-gene (SEQ ID NO: 2473).

Figure 15B shows the amino acid sequence of the S-gene (SEQ ID NO:2474).

Figure 16 shows the genome organization and transcription strategy of SARS-CoV
HK-39. Genomic and mRNA transcripts are capped (black circles), carry leader sequences
(vertical lines) at the 5' proximal end and are polyadenylated (A15). Arrows point to the
position of the intergenic sequence, 5'-CTAAACGAAC-3' (SEQ ID NO:2479). After release of the
positive-sense genomic RNA in the cytoplasm of the host cell, the viral RNA-dependent RNA
polymerase, encoded from ORF 1a and 1b, is synthesized. It carries out transcription of a
full-length complementary (negative-sense) RNA, from which new genomic RNA, an
overlapping set of subgenomic mRNA transcripts, and leader RNA are synthesized. Note
that all transcripts have common 5' leader sequences and common 3' ends.
ORF 1a and 1b - RNA-dependent RNA polymerase; S - the major peplomer glycoprotein;
M - transmembrane glycoprotein; N - nucleocapsid; X1, X2, X3 - putative proteins.

Figure 17 shows a construct map of pSARSCoV-ORF1b-N. PCR products
amplified from ORF1b (1b) and the N gene of SARS-CoV were co-ligated into the cloning vector
pCR2.1-TOPO (Invitrogen). The nucleotide (nt) numbers correspond to the positions in
the sequence of HK-39 strain SARS-CoV (AY278491). Shadowed areas indicate the
amplicons produced by the primers used in the diagnostic test (i.e., SEQ ID NOS:2480 and 2481).

Figure 18 shows a photo of an agarose gel after electrophoresis of total RNA
extracted from SARS patients using the SV Total RNA isolation system. The extracted RNA
was then subjected to a reverse-transcription polymerase chain reaction (RT-PCR) assay for
the detection of coronavirus in the patients.

Figure 19 shows the effect of potential inhibitors in Reverse Transcription
Polymerase Chain Reaction (RT-PCR). To remove potential inhibitors, total RNA eluted
from the SV96 Binding Plate was precipitated with 95 % ethanol and 3 M sodium acetate and
resuspended in 12 µl of nuclease-free water. RT-PCR was performed with actin-F (SEQ ID
NO:2482) and actin-R (SEQ ID NO:2483) primers. Numbers indicated were the number of
pig kidney epithelial (PK-15) cells added to the sample as an internal control. There was no
DNA fragment amplified with untreated RNA samples.

Figure 20 shows the primers used for amplifying various genes. SRS251 (SEQ ID
NO:2480) and SRS252 (SEQ ID NO:2481) amplified a 225 base pair fragment from the
region of the N-gene that showed no homology to other coronaviruses. coro3 (SEQ ID NO: 3)
and coro4 (SEQ ID NO: 4) amplified RNA-dependent RNA polymerase (1b gene) as a
control. Actin-F (SEQ ID NO:2482) and actin-R (SEQ ID NO:2483) amplified a 745 base
pair fragment from the β-actin gene as an internal control for PCR assays.

Figure 21A shows an amplification plot of fluorescence intensity against the number
of PCR cycles. Black lines show the dynamic range of N-gene specific PCR with serially
diluted plasmid construct from 10^1 to 10^6 copies. NPA samples from non-SARS patients,
including patients suffering from adenovirus (n = 5), respiratory syncytial virus (n = 5),
human metapneumovirus (n = 5), influenza A virus (n = 5), or influenza B virus (n = 5)
infection, are shown in gray lines. Lines with triangles denote the SARS-CoV positive
NPA samples; NTC represents no template control; the X-axis indicates the cycle number of
quantitative PCR performed, while the Y-axis represents the fluorescence intensity (FAM-490)
over background signal (Delta Rn). The inset shows the melting curve analysis of the PCR
products. Signals from positive (+ve) samples, negative (-ve) samples and no template control are
indicated. The X-axis indicates the temperature (°C), while the Y-axis represents the fluorescence
intensity (Delta Rn). Figure 21B shows a comparison of the dynamic ranges of N-gene and
1b-gene specific PCRs. Dynamic ranges of both N-gene and 1b-gene PCR were obtained with the
same plasmid construct, in which a 1:1 ratio of the corresponding amplicons was subcloned.
Serially diluted plasmid with copy number ranging from 10^-1 to 10^5 copies was used as
template in both PCRs. Lines with triangles denote N-gene specific PCR while the gray
lines indicate 1b-gene specific PCR. The inset shows Ct values ± standard deviation in a
triplicate set of experiments for both PCRs with the different copy numbers of template used.
NTC represents no template control; the X-axis indicates the cycle number of quantitative PCR
performed, while the Y-axis represents fluorescence intensity.

Figures 22A and 22B show an amplification curve and a melting curve, respectively, of
real-time quantitative PCR specific to the 1b gene (using the primers having SEQ ID NOS:3 and 4)
and the N gene (using the primers having SEQ ID NOS:2480 and 2481) of SARS-CoV. Fig. 22A:
Amplification plot of fluorescence intensity against the number of PCR cycles. One (1) µl
of cDNA from an NPA, tracheal dispersion and lung biopsy of patients with clinical
symptoms was used as template in each PCR. Fifty (50) cycles of PCR were performed to
achieve the saturation phase of the reaction. The X-axis indicates the cycle number of
quantitative PCR performed, while the Y-axis represents the fluorescence intensity (FAM-490)
over background signal. The horizontal gray line indicates the threshold value calculated by the
maximum curvature approach, and the baseline cycle Ct was calculated automatically. The inset
shows the half-maximal fluorescence value (1/2 max) and Ct of both PCRs with cDNA from
various tissues isolated from a key patient (patient A indicated in Drosten et al., New Engl. J.
Med. 348:1967-76 (2003)) at three different time points. NPA = nasopharyngeal
aspirate; TW = tracheal wash; LW = lung wash. Fig. 22B: Melting curves of PCR products.
Melting curve analysis was carried out after a 10-minute further-extension step of the reaction.
The temperature was raised from 56°C to 94°C by 76 increments of 0.5°C each, while each
set-point temperature was held for 7 seconds for data collection and analysis. The melting
temperatures of the 1b- and N-gene specific PCR products were 80.5°C and 85.5°C, respectively.
The X-axis indicates the temperature in degrees Celsius while the Y-axis represents the
fluorescence intensity (FAM-490) over background signal. One (1) µl of water was used as no
template control in the reaction.
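
The melting-curve ramp described for Fig. 22B can be checked with a short sketch. It is illustrative only and is not part of the disclosed assay: the function names are hypothetical, the fluorescence values are a synthetic curve made up for the example, and the melting temperature is taken as the set-point with the steepest fluorescence drop, which is one common convention rather than the instrument's documented algorithm.

```python
import math


def melt_ramp(start_c, step_c, increments):
    """Set-point temperatures of a stepped ramp, including the starting point."""
    return [start_c + i * step_c for i in range(increments + 1)]


def melting_temperature(temps, fluorescence):
    """Return the interior set-point where fluorescence falls fastest.

    The slope -dF/dT is estimated with a central difference at each interior point.
    """
    slopes = {
        temps[i]: (fluorescence[i - 1] - fluorescence[i + 1])
        / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    }
    return max(slopes, key=slopes.get)


# Ramp from the figure legend: 56 degrees C raised by 76 increments of 0.5 degrees C.
temps = melt_ramp(56.0, 0.5, 76)

# Synthetic signal (assumption): a sigmoid that melts around 85.5 degrees C.
signal = [1.0 / (1.0 + math.exp(t - 85.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(temps[0], temps[-1], len(temps), tm)
```

The ramp arithmetic confirms that 76 steps of 0.5°C starting at 56°C end at 94°C, giving 77 set-points if the starting temperature is counted as one.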

Figure 23 shows the diagnostic result of 48 clinical samples using the primers
having SEQ ID NOS: 2480 and 2481, respectively, with β-actin PCR as an internal control.
Upper bands in each row show a 745 bp DNA fragment amplified with actin-F and actin-R,
while lower bands are the amplicons produced by primers specific for the N-gene of SARS
coronavirus (225 bp). The -ve control (water) and +ve control (cDNA from SARS coronavirus
infected Vero cells) of the assay are indicated. Five (5) µl of PCR products of both
reactions were mixed and loaded into the sample well in a 2 % agarose gel. M = 1 kb plus
molecular marker (Invitrogen).

Figure 24 shows Northern blot analysis of SARS-CoV total RNA. Total RNA of
SARS-CoV was extracted from SARS-CoV infected Vero E6 cells. RNA was separated in a
1 % denaturing gel containing 6.29 % formaldehyde. Afterwards, RNA was transferred to a
positively charged nylon membrane and hybridized with digoxigenin-labelled PCR
fragments specific to the 1b, S, M and N genes, respectively. Lane 1 - 1b; lane 2 - S; lane 3 -
M; lane 4 - N. The vertical bar shows the molecular size reference. Arrows indicate the
transcripts hybridized with the N probe. Signals were analyzed by chemiluminescence.

Figure 25 shows the DNA probes used in Northern blot analysis. The probes for the 1b
gene (nt 18057-18222; SEQ ID NO:2484), S gene (nt 21920-22107; SEQ ID NO:2485), M
gene (nt 25867-26996; SEQ ID NO:2486), and N gene (nt 28658-28883; SEQ ID NO:2487)
are shown.



5. DETAILED DESCRIPTION OF THE INVENTION 

The present inventors developed a rapid, high-throughput reverse transcription-PCR
diagnostic test for SARS associated coronavirus (SARS-CoV). The 3' region of the hSARS
virus genome, including the Nucleocapsid gene (N-gene), represents a sensitive molecular
marker which can be used in addition to the 1b gene to increase the sensitivity of the test. An
internal control using PK-15 cells may be employed to ensure the integrity of RNA during
its extraction process and cDNA synthesis, thus eliminating false negative results.

In mouse hepatitis virus (MHV), a typical member of the genus Coronavirus, both
genomic RNA and mRNA transcripts are capped and have common 3' ends and common
leader sequences on their 5' ends. With this unique transcription strategy, the copy numbers
of different viral genes during proliferation of the virus in its host are different (Figure 19). The N
gene, which encodes the nucleocapsid, has the most abundant copy number during virus
replication, as all transcripts may carry nucleotide sequence from the N gene, although they are
not all in-frame for translation of this gene product. The present inventors have discovered
that a diagnostic assay based on the 3' region, including the N-gene, of the viral genome
provides a more sensitive assay than one based on the rest of the viral genome. Accordingly, in
preferred embodiments, nucleic acid molecules that may be used for a diagnostic assay comprise
the nucleic acid sequence of nucleotide position 18000 to 29742 of SEQ ID NO: 15, or portions
thereof. The portions may comprise 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350,
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, or
1200 nucleotides having nucleic acid sequence from nucleotide position 18000 to 29742
of SEQ ID NO: 15. In other preferred embodiments, nucleic acid molecules that may be
used for a diagnostic assay comprise the nucleic acid sequence of nucleotide position 28658
to 28883 or 29247-29410 of SEQ ID NO: 15.
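
As an illustration of how the nucleotide positions recited above translate into fragment lengths, the following sketch computes the span of each region. It assumes that positions are 1-based and inclusive at both ends, which the text does not state explicitly; the dictionary keys, function names and toy sequence are hypothetical.

```python
def region_length(start, end):
    """Length of a region given 1-based, inclusive start and end positions."""
    return end - start + 1


def extract_region(sequence, start, end):
    """Slice a region out of a sequence using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


# Coordinates recited in the text, relative to SEQ ID NO: 15.
REGIONS = {
    "three_prime_region": (18000, 29742),
    "n_gene_window_a": (28658, 28883),
    "n_gene_window_b": (29247, 29410),
}

lengths = {name: region_length(*coords) for name, coords in REGIONS.items()}

for name, length in lengths.items():
    print(f"{name}: {length} nt")

# Toy sequence (not viral sequence) showing the coordinate convention.
print(extract_region("ACGTACGT", 2, 4))
```

Under this convention the 3' region spans 11,743 nucleotides and the two N-gene windows span 226 and 164 nucleotides. If the positions were instead read as half-open intervals, each length would be one nucleotide shorter.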

Nasopharyngeal aspirate (NPA) and stool samples were obtained from SARS 
suspected patients with major clinical symptoms and significant history of close contact 
with infected patients. Total RNA was extracted from the subject samples, together with
PK-15 cells as an internal control. Samples were analyzed by the reverse-transcription-PCR
assay. Northern blot analysis was performed to show different subgenomic transcripts of
the virus. Real-time quantitative PCR was employed to compare the sensitivity of two loci 
used in this diagnostic assay. In specific embodiments, PCR inhibitor was removed with 
ethanol precipitation after RNA extraction process. 

In preferred embodiments, the present invention provides a method for detecting the
presence or absence of nucleic acid of the N-gene in a biological sample. The method
involves obtaining a biological sample from various sources and contacting the sample with
a compound or an agent capable of detecting a nucleic acid (e.g., mRNA, genomic RNA) of
the N-gene of the hSARS virus such that the presence of the N-gene is detected in the
sample. In preferred embodiments, the N-gene may be detected using a labeled nucleic acid
probe comprising the nucleotide sequence of SEQ ID NO: 2471, a complement thereof, or a
portion thereof. The portion may be 10, 20, 30, 40, 50, 100, 200, 400, 500, 600, 800, 1000, or
1200 nucleotides in length. In preferred embodiments, primers comprising the nucleotide
sequence of SEQ ID NOS:2475 and/or 2476 or SEQ ID NOS:2480 and/or 2481 may be
used to amplify a portion of the N-gene for detection.

A preferred agent for detecting hSARS mRNA or genomic RNA of the invention is 
a labeled nucleic acid probe capable of hybridizing to mRNA or genomic RNA encoding a 
polypeptide of the invention. The nucleic acid probe can be, for example, a nucleic acid
molecule comprising or consisting of the nucleotide sequence of SEQ ID NO: 1, 11, 13, 15,
2471 or 2473, a complement thereof, or a portion thereof, such as an oligonucleotide of at
least 15, 20, 25, 30, 50, 100, 250, 500, 750, 1,000 or more contiguous nucleotides in length 
and sufficient to specifically hybridize under stringent conditions to a hSARS mRNA or 
genomic RNA. 

In another preferred specific embodiment, the presence of the N-gene is detected in the
sample by a reverse transcription polymerase chain reaction (RT-PCR) using primers
that are constructed based on a partial nucleotide sequence of the N-gene or a genomic
nucleic acid sequence of SEQ ID NO: 15, or based on a nucleotide sequence of SEQ ID
NO: 1, 11, 13, 15, 2471, or 2473. In a non-limiting specific embodiment, preferred primers
to be used in an RT-PCR method are: 5'-TACACACCTCAGC-GTTG-3' (SEQ ID NO:3)
and/or 5'-CACGAACGTGACG-AAT-3' (SEQ ID NO:4), in the presence of 2.5 mM
MgCl2, and the thermal cycles are, for example, but not limited to, 94 °C for 8 min followed
by 40 cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min (also see Sections 6.7
and 6.8 infra). In preferred embodiments, the primers comprise the nucleic acid sequence of
SEQ ID NOS:2475 and 2476. In another preferred embodiment, the primers comprise the
nucleic acid sequence of SEQ ID NOS:2480 and 2481. In preferred embodiments, the
thermal cycles are 94 °C for 10 min followed by 40 cycles of 94 °C for 30 seconds, 56 °C
for 30 seconds, 72 °C for 30 seconds, 72 °C for 10 minutes. In another preferred
embodiment, the thermal cycles are 94 °C for 3 min followed by 40 cycles of 94 °C for 30
seconds, 56 °C for 30 seconds, 72 °C for 30 seconds, 72 °C for 10 minutes. In a more
preferred specific embodiment, the present invention provides a real-time quantitative PCR
assay to detect the presence of hSARS virus in a biological sample by subjecting the cDNA
obtained by reverse transcription of the extracted total RNA from the sample to PCR
reactions using the specific primers, such as those having nucleotide sequences of SEQ ID
NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I, which fluoresces when
bound non-specifically to double-stranded DNA. The fluorescence signals from these
reactions are captured at the end of extension steps as PCR product is generated over a
range of the thermal cycles, thereby allowing the quantitative determination of the viral load
in the sample based on an amplification plot (see Section 6.7, infra).
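
Quantitative determination from an amplification plot is commonly done by fitting a standard curve of threshold cycle (Ct) against the logarithm of input copy number and reading unknown samples off that line. The sketch below shows the arithmetic under that assumption; it is not taken from Section 6.7, the function names are hypothetical, and the Ct values are invented for illustration rather than measured.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate input copy number from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical plasmid standards from 10^1 to 10^6 copies with made-up Ct values.
standard_copies = [10 ** k for k in range(1, 7)]
standard_cts = [38.0 - 3.3 * k for k in range(1, 7)]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
estimate = copies_from_ct(28.1, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(estimate))
```

A slope near -3.32 cycles per ten-fold dilution corresponds to doubling every cycle; the invented standards here give a slope of -3.3, so the implied efficiency comes out slightly above 1.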

In the preferred embodiment, the present invention provides a method for detecting
the presence or absence of nucleic acid of the S-gene in a biological sample. The method
involves obtaining a biological sample from various sources and contacting the sample with
a compound or an agent capable of detecting a nucleic acid (e.g., mRNA, genomic RNA) of
the S-gene of the hSARS virus such that the presence of the S-gene is detected in the
sample. A preferred agent for detecting hSARS mRNA or genomic RNA of the invention is
a labeled nucleic acid probe capable of hybridizing to mRNA or genomic RNA encoding a
polypeptide of the invention. The nucleic acid probe can be, for example, a nucleic acid
molecule comprising or consisting of the nucleotide sequence of SEQ ID NO: 1, 11, 13, 15,
2471, or 2473, or a portion thereof, such as an oligonucleotide of at least 15, 20, 25, 30, 50,
100, 250, 500, 750, 1,000 or more contiguous nucleotides in length and sufficient to
specifically hybridize under stringent conditions to a hSARS mRNA or genomic RNA.

In another preferred specific embodiment, the presence of the S-gene is detected in the
sample by a reverse transcription polymerase chain reaction (RT-PCR) using primers
that are constructed based on a partial nucleotide sequence of the S-gene (SEQ ID
NO:2473).

In vitro techniques for detection of mRNA include northern hybridizations, in situ
hybridizations, RT-PCR, and RNase protection. In vitro techniques for detection of
genomic RNA include northern hybridizations, RT-PCR, and RNase protection.

The polynucleotides encoding the N-gene may be amplified before they are detected.
The term "amplified" refers to the process of making multiple copies of the nucleic acid
from a single polynucleotide molecule. The amplification of polynucleotides can be carried
out in vitro by biochemical processes known to those of skill in the art. The amplification
agent may be any compound or system that will function to accomplish the synthesis of
primer extension products, including enzymes. Suitable enzymes for this purpose include,
for example, E. coli DNA polymerase I, Taq polymerase, Klenow fragment of E. coli DNA
polymerase I, T4 DNA polymerase, other available DNA polymerases, polymerase muteins,
reverse transcriptase, ligase, and other enzymes, including heat-stable enzymes (i.e., those
enzymes that perform primer extension after being subjected to temperatures sufficiently
elevated to cause denaturation). Suitable enzymes will facilitate combination of the
nucleotides in the proper manner to form the primer extension products that are
complementary to each mutant nucleotide strand. Generally, the synthesis will be initiated
at the 3'-end of each primer and proceed in the 5'-direction along the template strand, until
synthesis terminates, producing molecules of different lengths. There may be amplification
agents, however, that initiate synthesis at the 5'-end and proceed in the other direction,
using the same process as described above. In any event, the method of the invention is not
to be limited to the embodiments of amplification described herein.

One method of in vitro amplification, which can be used according to this invention,
is the polymerase chain reaction (PCR) described in U.S. Patent Nos. 4,683,202 and
4,683,195. The term "polymerase chain reaction" refers to a method for amplifying a DNA
base sequence using a heat-stable DNA polymerase and two oligonucleotide primers, one
complementary to the (+)-strand at one end of the sequence to be amplified and the other
complementary to the (-)-strand at the other end. Because the newly synthesized DNA
strands can subsequently serve as additional templates for the same primer sequences,
successive rounds of primer annealing, strand elongation, and dissociation produce rapid
and highly specific amplification of the desired sequence. The polymerase chain reaction is
used to detect the presence of polynucleotides encoding cytokines in the sample. Many
polymerase chain methods are known to those of skill in the art and may be used in the
method of the invention. For example, DNA can be subjected to 30 to 35 cycles of
amplification in a thermocycler as follows: 95°C for 30 sec, 52° to 60°C for 1 min, and
72°C for 1 min, with a final extension step of 72°C for 5 min. For another example, DNA
can be subjected to 35 polymerase chain reaction cycles in a thermocycler at a denaturing
temperature of 95°C for 30 sec, followed by varying annealing temperatures ranging from
54-58°C for 1 min, an extension step at 70°C for 1 min and a final extension step at 70°C.
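
The hold times in the first example program above can be summed to estimate run length, and the cycle count gives the theoretical upper bound on amplification. The sketch below is a back-of-the-envelope illustration only: it ignores temperature ramp time between steps, assumes perfect doubling in every cycle, and uses hypothetical function names.

```python
def program_seconds(cycles, step_seconds, initial_seconds=0, final_seconds=0):
    """Total hold time of a cycling program, ignoring ramping between steps."""
    return initial_seconds + cycles * sum(step_seconds) + final_seconds


def ideal_fold_amplification(cycles):
    """Theoretical copies per starting template if every cycle doubles the product."""
    return 2 ** cycles


# Steps from the text: 30 sec denaturation, 1 min annealing, 1 min extension,
# followed by a 5 min final extension, run for 30 to 35 cycles.
steps = [30, 60, 60]
shortest = program_seconds(30, steps, final_seconds=300)
longest = program_seconds(35, steps, final_seconds=300)

print(shortest / 60, longest / 60)
print(ideal_fold_amplification(30))
```

On these assumptions the program holds for between 80 and 92.5 minutes, and 30 cycles correspond to roughly a billion-fold theoretical amplification. Real reactions fall short of this because efficiency drops as reagents are consumed.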

The primers for use in amplifying the N-gene or S-gene of the invention may be
prepared using any suitable method, such as conventional phosphotriester and
phosphodiester methods or automated embodiments thereof, so long as the primers are
capable of hybridizing to the polynucleotides of interest. One method for synthesizing
oligonucleotides on a modified solid support is described in U.S. Patent No. 4,458,066. The
exact length of primer will depend on many factors, including temperature, buffer, and
nucleotide composition. The primer must prime the synthesis of extension products in the
presence of the inducing agent for amplification.

Primers used according to the method of the invention are complementary to each
strand of nucleotide sequence to be amplified. The term "complementary" means that the
primers must hybridize with their respective strands under conditions that allow the
agent for polymerization to function. In other words, the primers that are complementary to
the flanking sequences hybridize with the flanking sequences and permit amplification of
the nucleotide sequence. Preferably, the 3' terminus of the primer that is extended has
perfectly base paired complementarity with the complementary flanking strand. Primers
and probes for polynucleotides encoding N-gene or S-gene of the present invention can be
developed using known methods combined with the present disclosure.
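The preference for a perfectly paired 3' terminus can be checked mechanically. The sketch below is a minimal illustration in Python, assuming plain A/C/G/T sequences written 5' to 3'; the function names, the default window of three bases and the sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def three_prime_is_perfect(primer: str, site: str, window: int = 3) -> bool:
    """True if the last `window` bases at the primer's 3' end pair perfectly.

    `primer` and `site` are both written 5'->3' and must be the same length;
    `site` is the stretch of template strand the primer anneals to.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    if not 1 <= window <= len(primer):
        raise ValueError("window must be between 1 and the primer length")
    # The primer's 3' end pairs with the 5' end of the site, so in the
    # reverse complement of the primer the 3'-terminal bases come first.
    expected_site = reverse_complement(primer)
    return expected_site[:window] == site.upper()[:window]
```

For a made-up primer `GATTACAGC`, the perfectly matching site is `GCTGTAATC`. Changing the last base of that site (which pairs with the primer's 5' end) leaves the check true, while changing its first base (which pairs with the primer's 3' end) makes it false.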

Those of ordinary skill in the art will know of various amplification methodologies
that can also be utilized to increase the copy number of target nucleic acid. The
polynucleotides detected in the method of the invention can be further evaluated, detected,
cloned, sequenced, and the like, either in solution or after binding to a solid support, by any
method usually applied to the detection of a specific nucleic acid sequence such as another
polymerase chain reaction, oligomer restriction (Saiki et al., Bio/Technology 3:1008-1012
(1985)), allele-specific oligonucleotide (ASO) probe analysis (Conner et al., Proc. Natl.
Acad. Sci. USA 80:278 (1983)), oligonucleotide ligation assays (OLAs) (Landegren et al.,
Science 241:1011 (1988)), RNAse Protection Assay and the like. Molecular techniques for
DNA analysis have been reviewed (Landegren et al., Science 242:229-237 (1988)).
Following DNA amplification, the reaction product may be detected by Southern blot
analysis, without using radioactive probes. In such a process, for example, a small sample
of DNA containing the polynucleotides obtained from the tissue or subject is amplified,
and analyzed via a Southern blotting technique. The use of non-radioactive probes or labels
is facilitated by the high level of the amplified signal. In one embodiment of the invention,
one nucleoside triphosphate is radioactively labeled, thereby allowing direct visualization of
the amplification product by autoradiography. In another embodiment, amplification
primers are fluorescently labeled and run through an electrophoresis system. Visualization
of amplified products is by laser detection followed by computer assisted graphic display,
without a radioactive signal.
The methods of the present invention can involve a real-time quantitative PCR assay,
such as a Taqman® assay (Holland et al., Proc Natl Acad Sci USA, 88(16):7276 (1991);
also see U.S. Patent Application of Attorney Docket No. V9661.0078 filed March 24, 2004,
which is incorporated by reference in its entirety). The assays can be performed on an
instrument designed to perform such assays, for example those available from Applied
Biosystems (Foster City, CA). Primers and probes for such an assay can be designed
according to known procedures in the art.

The size of the primers used to amplify a portion of the N-gene or S-gene is at least
10, 15, 20, 25, or 30 nucleotides in length. In particular, primers that amplify the N-gene or
S-gene are most preferred. Preferably, the GC ratio should be above 30, 35, 40, 45, 50, 55, or
60% so as to prevent hairpin structures in the primer. Furthermore, the amplicon should be
sufficiently long to be detected by standard molecular biology methodologies.
Preferably, the amplicon is at least 40, 60, 100, 200, 300, 400, 500, 600, 800, or 1000 base
pairs in length.
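The primer and amplicon criteria in this paragraph can be expressed as a simple screening function. The Python sketch below is illustrative only: the function names are hypothetical, and the default thresholds (20 nucleotides, 40% GC, 100 base pairs) are just one combination chosen from the alternatives listed in the text. Following the wording of the text, the GC ratio must be strictly above its threshold, while the two lengths need only reach theirs.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def meets_design_criteria(
    primer: str,
    amplicon_len: int,
    min_primer_len: int = 20,
    min_gc_percent: float = 40.0,
    min_amplicon_len: int = 100,
) -> bool:
    """Screen a primer and its expected amplicon length against the thresholds.

    Primer and amplicon lengths are "at least" limits; the GC ratio is an
    "above" limit, so a primer exactly at min_gc_percent is rejected.
    """
    return (
        len(primer) >= min_primer_len
        and gc_percent(primer) > min_gc_percent
        and amplicon_len >= min_amplicon_len
    )
```

A made-up 20-mer such as `ATGCATGCATGCATGCATGC` (50% GC) with a 150 base pair amplicon passes with the default thresholds, whereas an all-A/T primer of the same length, or the same primer with a 50 base pair amplicon, does not.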

In a specific embodiment, the methods further involve obtaining a control sample
from a control subject, contacting the control sample with a compound or agent capable of
detecting N-gene or S-gene, such that the presence of mRNA or genomic RNA encoding the
N-gene or S-gene is detected in the sample, and comparing the presence (or absence) of
N-gene or S-gene, or mRNA or genomic RNA encoding the polypeptide in the control sample
with the presence of N-gene or S-gene, or mRNA or genomic RNA encoding the
polypeptide in the test sample.

The invention also encompasses kits for detecting the presence of N-gene nucleic
acid in a test sample. The kit, for example, can comprise a labeled compound or agent
capable of detecting a nucleic acid molecule encoding the polypeptide in a test sample and,
in certain embodiments, a means for determining the amount of mRNA in the sample (an
oligonucleotide probe which binds to DNA or mRNA encoding the polypeptide).

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence encoding a polypeptide of the invention or to a sequence within the N-gene; or
(2) a pair of primers useful for amplifying a nucleic acid molecule containing the N-gene
sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or a protein
stabilizing agent. The kit can also comprise components necessary for detecting the
detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample
or a series of control samples which can be assayed and compared to the test sample
contained. Each component of the kit is usually enclosed within an individual container and
all of the various containers are within a single package along with instructions for use.

The present invention relates to the isolated N-gene and S-gene of the hSARS virus.
In a specific embodiment, the virus comprises a nucleotide sequence of SEQ ID NO:1, 11,
13, 15, 2471, and/or 2473. In a specific embodiment, the present invention provides
isolated nucleic acid molecules of the hSARS virus, comprising, or, alternatively, consisting
of the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, and/or 2473, a complement
thereof or a portion thereof. In another specific embodiment, the invention provides
isolated nucleic acid molecules which hybridize under stringent conditions, as defined
herein, to a nucleic acid molecule having the sequence of SEQ ID NO:1, 11, 13, 15, 2471,
and/or 2473, or specific genes of known members of Coronaviridae, or a complement
thereof. In another specific embodiment, the invention provides isolated polypeptides or
proteins that are encoded by a nucleic acid molecule comprising a nucleotide sequence that
is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550,
600, or more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:1, or a
complement thereof. In another specific embodiment, the invention provides isolated
polypeptides or proteins that are encoded by a nucleic acid molecule comprising a
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200,
300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100,
1,150, 1,200, or more contiguous nucleotides of the nucleotide sequence of SEQ ID NO:11,
or a complement thereof. In yet another specific embodiment, the invention provides
isolated polypeptides or proteins that are encoded by a nucleic acid molecule comprising a
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200,
300, 350, 400, 450, 500, 550, 600, 650, 700, or more contiguous nucleotides of the
nucleotide sequence of SEQ ID NO:13, or a complement thereof. In yet another specific
embodiment, the invention provides isolated polypeptides or proteins that are encoded by a
nucleic acid molecule comprising or, alternatively consisting of a nucleotide sequence that
is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600,
650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200 or more contiguous
nucleotides of the nucleotide sequence of SEQ ID NO:2471, or a complement thereof. In
yet another specific embodiment, the invention provides isolated polypeptides or proteins
that are encoded by a nucleic acid molecule comprising or, alternatively consisting of a
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350,
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150,
1,200, 2,000, 3,000, or more contiguous nucleotides of the nucleotide sequence of SEQ ID
NO:2473, or a complement thereof. In yet another specific embodiment, the invention
provides isolated polypeptides or proteins that are encoded by a nucleic acid molecule
comprising or, alternatively consisting of a nucleotide sequence that is at least 5, 10, 15, 20,
25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850,
900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000,
9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000,
20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:15, or a complement
thereof. The polypeptides include those shown in Figures 11 (SEQ ID NOS:17-239,
241-736 and 738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470) or having
an amino acid sequence of SEQ ID NO:2472 or 2474. The polypeptides or the proteins of
the present invention preferably have one or more biological activities of the proteins
encoded by the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473, or the polypeptides
shown in Figures 11 and 12, or the native viral proteins containing the amino acid
sequences encoded by the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473.

The present invention also relates to a method for propagating the hSARS virus in
host cells.

The invention further relates to the use of the sequence information of the isolated
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention
provides the entire nucleotide sequence of hSARS virus, CCTCC-V200303, SEQ ID NO:15,
or fragments, or complement thereof. Furthermore, the present invention relates to a
nucleic acid molecule that hybridizes to any portion of the genome of the hSARS virus,
CCTCC-V200303, or SEQ ID NO:15, under stringent conditions. In a specific
embodiment, the invention provides nucleic acid molecules which are suitable for use as
primers consisting of or comprising the nucleotide sequence of SEQ ID NO:1, 11, 13, 15,
2471 or 2473, or a complement thereof, or a portion thereof. In specific embodiments, the
primers comprise the nucleotide sequence of SEQ ID NO:2475, 2476, 2477, 2478, 2480 or
2481. In another specific embodiment, the invention provides nucleic acid molecules which
are suitable for use as hybridization probes for the detection of nucleic acids encoding a
polypeptide of the invention, consisting of or comprising the nucleotide sequence of SEQ
ID NO:1, 11, 13, 15, 2471 or 2473, a complement thereof, or a portion thereof. The
invention further relates to a kit comprising primers having the nucleic acid sequence of SEQ
ID NOS:2475 and 2476; and SEQ ID NOS:2480 and 2481, for the detection of N-gene. In
another embodiment, the invention relates to a kit comprising primers having the nucleic acid
sequence of SEQ ID NOS:2477 and/or 2478 for the detection of S-gene. In a preferred
embodiment, the kit further comprises reagents for the detection of genes not found in
hSARS virus as a negative control. The invention further encompasses chimeric or
recombinant viruses or viral proteins encoded by said nucleotide sequences.

The invention further provides antibodies that specifically bind a polypeptide of the
invention encoded by the nucleotide sequence of SEQ ID NO:1, 11, 13, 2471 or 2473, or a
fragment thereof, or any hSARS epitope. The invention further provides antibodies that
specifically bind a polypeptide having the amino acid sequence of SEQ ID NO:2472 or 2474.
The invention further provides antibodies that specifically bind the polypeptides of the
invention encoded by the nucleotide sequence of SEQ ID NO:15, or the polypeptides shown
in Figures 11 and 12, or a fragment thereof, or any hSARS epitope. Such antibodies include,
but are not limited to, polyclonal, monoclonal, bi-specific, multi-specific, human, humanized,
chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments,
disulfide-linked Fvs, intrabodies and fragments containing either a VL or VH domain or even a
complementarity determining region (CDR) that specifically binds to a polypeptide of the
invention.

In one embodiment, the invention provides methods for detecting the presence,
activity or expression of the hSARS virus of the invention in a biological material, such as
cells, blood, saliva, urine, sputum, nasopharyngeal aspirates, and so forth. The presence of
the hSARS virus in a sample can be determined by contacting the biological material with
an agent which can detect directly or indirectly the presence of the hSARS virus. In a
specific embodiment, the detection agents are the antibodies of the present invention. In
another embodiment, the detection agent is a nucleic acid of the present invention.

In another embodiment, the invention provides vaccine preparations comprising the
hSARS virus, including recombinant and chimeric forms of said virus, or subunits of the
virus. In a specific embodiment, the vaccine preparations comprise live but attenuated
hSARS virus with or without pharmaceutically acceptable excipients, including adjuvants.
In another specific embodiment, the vaccine preparations comprise an inactivated or killed
hSARS virus with or without pharmaceutically acceptable excipients, including adjuvants.
The vaccine preparations of the present invention may further comprise adjuvants.
Accordingly, the present invention further provides methods of preparing recombinant or
chimeric forms of hSARS. In another specific embodiment, the vaccine preparations of the
present invention comprise one or more nucleic acid molecules comprising or consisting of
the sequence of SEQ ID NO:1, 11, 13, 15, 2471 and/or 2473, or a fragment thereof. In
another embodiment, the invention provides vaccine preparations comprising one or more
polypeptides of the invention encoded by a nucleotide sequence comprising or consisting of
the nucleotide sequence of SEQ ID NO:1, 11, 13, 2471 and/or 2473, or the polypeptides
shown in Figures 11 and 12, or a fragment thereof. In another embodiment, the invention
provides vaccine preparations comprising one or more polypeptides of the invention
encoded by a nucleotide sequence comprising or consisting of the nucleotide sequence of
SEQ ID NO:15, or a fragment thereof. Furthermore, the present invention provides
methods for treating, ameliorating, managing, or preventing SARS by administering the
vaccine preparations or antibodies of the present invention alone or in combination with
antivirals [e.g., amantadine, rimantadine, gancyclovir, acyclovir, ribavirin, penciclovir,
oseltamivir, foscarnet, zidovudine (AZT), didanosine (ddI), lamivudine (3TC), zalcitabine
(ddC), stavudine (d4T), nevirapine, delavirdine, indinavir, ritonavir, vidarabine, nelfinavir,
saquinavir, relenza, tamiflu, pleconaril, interferons, etc.], steroids and corticosteroids such
as prednisone, cortisone, fluticasone and glucocorticoid, antibiotics, analgesics,
bronchodilators, or other treatments for respiratory and/or viral infections.

Furthermore, the present invention provides pharmaceutical compositions
comprising anti-viral agents of the present invention and a pharmaceutically acceptable
carrier. The present invention also provides kits comprising pharmaceutical compositions
of the present invention.

In another aspect, the present invention provides methods for screening anti-viral
agents that inhibit the infectivity or replication of hSARS virus or variants thereof.

5.1 Recombinant and Chimeric hSARS Viruses 

The present invention encompasses recombinant or chimeric viruses encoded by
viral vectors derived from the genome of hSARS virus or natural variants thereof. In a
specific embodiment, a recombinant virus is one derived from the hSARS virus of deposit
accession no. CCTCC-V200303. In a specific embodiment, the virus has a nucleotide
sequence of SEQ ID NO:15. In another specific embodiment, a recombinant virus is one
derived from a natural variant of hSARS virus. A natural variant of hSARS has a sequence
that is different from the genomic sequence (SEQ ID NO:15) of the hSARS virus,
CCTCC-V200303, due to one or more naturally occurring mutations, including, but not limited to,
point mutations, rearrangements, insertions, deletions etc., to the genomic sequence that
may or may not result in a phenotypic change. In accordance with the present invention, a
viral vector which is derived from the genome of the hSARS virus, CCTCC-V200303, is
one that contains a nucleic acid sequence that encodes at least a part of one ORF of the
hSARS virus. In a specific embodiment, the ORF comprises or consists of a nucleotide
sequence of SEQ ID NO:1, 11, 13, 2471, 2473, or a fragment thereof. In a specific
embodiment, there is more than one ORF within the nucleotide sequence of SEQ ID
NO:15, as shown in Figures 11 (SEQ ID NOS:16, 240 and 737) and 12 (SEQ ID NOS:1108,
1590 and 1965), or a fragment thereof. In another embodiment, the polypeptide encoded by
the ORF comprises or consists of an amino acid sequence of SEQ ID NO:2, 12, 14, 2472,
2474, or a fragment thereof, or shown in Figures 11 (SEQ ID NOS:17-239, 241-736 and
738-1107) and 12 (SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470), or a fragment
thereof. In accordance with the present invention these viral vectors may or may not
include nucleic acids that are non-native to the viral genome.

In another specific embodiment, a chimeric virus of the invention is a recombinant
hSARS virus which further comprises a heterologous nucleotide sequence. In accordance
with the invention, a chimeric virus may be encoded by a nucleotide sequence in which
heterologous nucleotide sequences have been added to the genome or in which endogenous
or native nucleotide sequences have been replaced with heterologous nucleotide sequences.
According to the present invention, the chimeric viruses are encoded by the viral
vectors of the invention which further comprise a heterologous nucleotide sequence. In
accordance with the present invention a chimeric virus is encoded by a viral vector that may
or may not include nucleic acids that are non-native to the viral genome. In accordance
with the invention a chimeric virus is encoded by a viral vector to which heterologous
nucleotide sequences have been added, inserted or substituted for native or non-native
sequences. In accordance with the present invention, the chimeric virus may be encoded by
nucleotide sequences derived from different strains or variants of hSARS virus. In
particular, the chimeric virus is encoded by nucleotide sequences that encode antigenic
polypeptides derived from different strains or variants of hSARS virus.
A chimeric virus may be of particular use for the generation of recombinant vaccines
protecting against two or more viruses (Tao et al., J. Virol. 72, 2955-2961; Durbin et al.,
2000, J. Virol. 74, 6821-6831; Skiadopoulos et al., 1998, J. Virol. 72, 1762-1768;
Teng et al., 2000, J. Virol. 74, 9317-9321). For example, it can be envisaged that a virus
vector derived from the hSARS virus expressing one or more proteins of variants of hSARS
virus, or vice versa, will protect a subject vaccinated with such vector against infections by
both the native hSARS and the variant. Attenuated and replication-defective viruses may be
of use for vaccination purposes with live vaccines as has been suggested for other viruses.
(See, PCT WO 02/057302, at pp. 6 and 23, incorporated by reference herein).

In accordance with the present invention, the heterologous sequences to be
incorporated into the viral vectors encoding the recombinant or chimeric viruses of the
invention include sequences obtained or derived from different strains or variants of hSARS.

In certain embodiments, the chimeric or recombinant viruses of the invention are
encoded by viral vectors derived from viral genomes wherein one or more sequences,
intergenic regions, termini sequences, or portions or entire ORF have been substituted with
a heterologous or non-native sequence. In certain embodiments of the invention, the
chimeric viruses of the invention are encoded by viral vectors derived from viral genomes
wherein one or more heterologous sequences have been inserted or added to the vector.

The selection of the viral vector may depend on the species of the subject that is to
be treated or protected from a viral infection. If the subject is human, then an attenuated
hSARS virus can be used to provide the antigenic sequences.

In accordance with the present invention, the viral vectors can be engineered to
provide antigenic sequences which confer protection against infection by the hSARS and
natural variants thereof. The viral vectors may be engineered to provide one, two, three or
more antigenic sequences. In accordance with the present invention the antigenic sequences
may be derived from the same virus, from different strains or variants of the same type of
virus, or from different viruses.

The expression products and/or recombinant or chimeric virions obtained in
accordance with the invention may advantageously be utilized in vaccine formulations. The
expression products and chimeric virions of the present invention may be engineered to
create vaccines against a broad range of pathogens, including viral and bacterial antigens,
tumor antigens, allergen antigens, and auto antigens involved in autoimmune disorders. In
particular, the chimeric virions of the present invention may be engineered to create
vaccines for the protection of a subject from infections with hSARS virus and variants
thereof.

In certain embodiments, the expression products and recombinant or chimeric
virions of the present invention may be engineered to create vaccines against a broad range
of pathogens, including viral antigens, tumor antigens and auto antigens involved in
autoimmune disorders. One way to achieve this goal involves modifying existing hSARS
genes to contain foreign sequences in their respective external domains. Where the
heterologous sequences are epitopes or antigens of pathogens, these chimeric viruses may
be used to induce a protective immune response against the disease agent from which these
determinants are derived.

Thus, the present invention relates to the use of viral vectors and recombinant or
chimeric viruses to formulate vaccines against a broad range of viruses and/or antigens.
The present invention also encompasses recombinant viruses comprising a viral vector
derived from the hSARS or variants thereof which contains sequences which result in a
virus having a phenotype more suitable for use in vaccine formulations, e.g., attenuated
phenotype or enhanced antigenicity. The mutations and modifications can be in coding
regions, in intergenic regions and in the leader and trailer sequences of the virus.

The invention provides a host cell comprising a nucleic acid or a vector according to
the invention. Plasmid or viral vectors containing the polymerase components of hSARS
virus are generated in prokaryotic cells for the expression of the components in relevant cell
types (bacteria, insect cells, eukaryotic cells). Plasmid or viral vectors containing
full-length or partial copies of the hSARS genome will be generated in prokaryotic cells for
the expression of viral nucleic acids in vitro or in vivo. The latter vectors may contain
other viral sequences for the generation of chimeric viruses or chimeric virus proteins, may
lack parts of the viral genome for the generation of replication-defective virus, and may
contain mutations, deletions or insertions for the generation of attenuated viruses. In
addition, the present invention provides a host cell infected with hSARS virus, for example,
of deposit no. CCTCC-V200303.

Infectious copies of hSARS (being wild type, attenuated, replication-defective or
chimeric) can be produced upon co-expression of the polymerase components according to
the state-of-the-art technologies described above.

In addition, eukaryotic cells transiently or stably expressing one or more full-length
or partial hSARS proteins can be used. Such cells can be made by transfection (proteins or
nucleic acid vectors), infection (viral vectors) or transduction (viral vectors) and may be
useful for complementation of the aforementioned wild type, attenuated, replication-defective
or chimeric viruses.

The viral vectors and chimeric viruses of the present invention may be used to
modulate a subject's immune system by stimulating a humoral immune response, a cellular
immune response or by stimulating tolerance to an antigen. As used herein, a subject means:
humans, primates, horses, cows, sheep, pigs, goats, dogs, cats, avian species and rodents.

5.2 Formulation of Vaccines and Antivirals 

In a preferred embodiment, the invention provides a proteinaceous molecule or
hSARS virus specific viral protein or functional fragment thereof encoded by a nucleic acid
according to the invention. Useful proteinaceous molecules are for example derived from
any of the genes or genomic fragments derivable from the virus according to the invention,
including envelope protein (E protein), integral membrane protein (M protein), spike protein
(S protein), nucleocapsid protein (N protein), hemagglutinin esterase (HE protein), and
RNA-dependent RNA polymerase. Such molecules, or antigenic fragments thereof, as provided
herein, are for example useful in diagnostic methods or kits and in pharmaceutical
compositions such as subunit vaccines. Particularly useful are polypeptides encoded by the
nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, 2473, or as shown in Fig. 11 (SEQ
ID NOS:17-239, 241-736 and 738-1107) and Fig. 12 (SEQ ID NOS:1109-1589, 1591-1964
and 1966-2470), or having the amino acid sequence of SEQ ID NO:2472 or 2474, or
antigenic fragments thereof for inclusion as antigen or subunit immunogen, but inactivated
whole virus can also be used. Particularly useful are also those proteinaceous substances
that are encoded by recombinant nucleic acid fragments of the hSARS genome; preferred, of
course, are those that are within the preferred metes and bounds of ORFs, in particular, for
eliciting hSARS specific antibody or T cell responses, whether in vivo (e.g. for protective or
therapeutic purposes or for providing diagnostic antibodies) or in vitro (e.g. by phage
display technology or another technique useful for generating synthetic antibodies).

The invention provides vaccine formulations for the prevention and treatment of
infections with hSARS virus. In certain embodiments, the vaccine of the invention
comprises recombinant and chimeric viruses of the hSARS virus. In certain embodiments,
the virus is attenuated.
In another embodiment of this aspect of the invention, inactivated vaccine
formulations may be prepared using conventional techniques to "kill" the chimeric viruses.
Inactivated vaccines are "dead" in the sense that their infectivity has been destroyed.
Ideally, the infectivity of the virus is destroyed without affecting its immunogenicity. In
order to prepare inactivated vaccines, the chimeric virus may be grown in cell culture or in
the allantois of the chick embryo, purified by zonal ultracentrifugation, inactivated by
formaldehyde or β-propiolactone, and pooled. The resulting vaccine is usually inoculated
intramuscularly.

Inactivated viruses may be formulated with a suitable adjuvant in order to enhance
the immunological response. Such adjuvants may include but are not limited to mineral
gels, e.g., aluminum hydroxide; surface active substances such as lysolecithin, pluronic
polyols, polyanions; peptides; oil emulsions; and potentially useful human adjuvants such as
BCG and Corynebacterium parvum.

In another aspect, the present invention also provides DNA vaccine formulations
comprising a nucleic acid or fragment of the hSARS virus, e.g., the virus having accession
no. CCTCC-V200303, or nucleic acid molecules having the sequence of SEQ ID NO:1, 11,
13, 15, 2471, 2473, or a fragment thereof. In another specific embodiment, the DNA
vaccine formulations of the present invention comprise a nucleic acid or fragment thereof
encoding the antibodies which immunospecifically bind hSARS viruses. In DNA vaccine
formulations, a vaccine DNA comprises a viral vector, such as that derived from the hSARS
virus, bacterial plasmid, or other expression vector, bearing an insert comprising a nucleic
acid molecule of the present invention operably linked to one or more control elements,
thereby allowing expression of the vaccinating proteins encoded by said nucleic acid
molecule in a vaccinated subject. Such vectors can be prepared by recombinant DNA
technology as recombinant or chimeric viral vectors carrying a nucleic acid molecule of the
present invention (see also Section 5.1, supra).

Various heterologous vectors are described for DNA vaccinations against viral
infections. For example, the vectors described in the following references may be used to
express hSARS sequences instead of the sequences of the viruses or other pathogens
described; in particular, vectors described for hepatitis B virus (Michel, M.L. et al., 1995,
DNA-mediated immunization to the hepatitis B surface antigen in mice: Aspects of the
humoral response mimic hepatitis B viral infection in humans, Proc. Natl. Acad. Sci. USA
92:5307-5311; Davis, H.L. et al., 1993, DNA-based immunization induces continuous
secretion of hepatitis B surface antigen and high levels of circulating antibody, Human Molec.
Genetics 2:1847-1851), HIV (Wang, B. et al., 1993, Gene inoculation generates
immune responses against human immunodeficiency virus type 1, Proc. Natl. Acad. Sci. USA
90:4156-4160; Lu, S. et al., 1996, Simian immunodeficiency virus DNA vaccine trial in
macaques, J. Virol. 70:3978-3991; Letvin, N.L. et al., 1997, Potent, protective anti-HIV
immune responses generated by bimodal HIV envelope DNA plus protein vaccination, Proc.
Natl. Acad. Sci. USA 94(17):9378-83), and influenza viruses (Robinson, H.L. et al., 1993,
Protection against a lethal influenza virus challenge by immunization with a
haemagglutinin-expressing plasmid DNA, Vaccine 11:957-960; Ulmer, J.B. et al.,
Heterologous protection against influenza by injection of DNA encoding a viral protein,
Science 259:1745-1749), as well as bacterial infections, such as tuberculosis (Tascon, R.E.
et al., 1996, Vaccination against tuberculosis by DNA injection, Nature Med. 2:888-892;
Huygen, K. et al., 1996, Immunogenicity and protective efficacy of a tuberculosis DNA
vaccine, Nature Med. 2:893-898), and parasitic infection, such as malaria (Sedegah, M.,
1994, Protection against malaria by immunization with plasmid DNA encoding
circumsporozoite protein, Proc. Natl. Acad. Sci. USA 91:9866-9870; Doolan, D.L. et al.,
1996, Circumventing genetic restriction of protection against malaria with multigene DNA
immunization: CD8+ T cell-, interferon γ-, and nitric oxide-dependent immunity, J. Exper.
Med. 183:1739-1746).

Many methods may be used to introduce the vaccine formulations described above.
These include, but are not limited to, oral, intradermal, intramuscular, intraperitoneal,
intravenous, subcutaneous, and intranasal routes. Alternatively, it may be preferable to
introduce the chimeric virus vaccine formulation via the natural route of infection of the
pathogen for which the vaccine is designed. The DNA vaccines of the present invention
may be administered in saline solutions by injections into muscle or skin using a syringe
and needle (Wolff, J.A. et al., 1990, Direct gene transfer into mouse muscle in vivo, Science
247:1465-1468; Raz, E., 1994, Intradermal gene immunization: The possible role of DNA
uptake in the induction of cellular immunity to viruses, Proc. Natl. Acad. Sci. USA
91:9519-9523). Another way to administer DNA vaccines is called the "gene gun" method,
whereby microscopic gold beads coated with the DNA molecules of interest are fired into
the cells (Tang, D. et al., 1992, Genetic immunization is a simple method for eliciting an
immune response, Nature 356:152-154). For general reviews of the methods for DNA
vaccines, see Robinson, H.L., 1999, DNA vaccines: basic mechanism and immune responses
(Review), Int. J. Mol. Med. 4(5):549-555; Barber, B., 1991, Introduction: Emerging vaccine
strategies, Seminars in Immunology 9(5):269-270; and Robinson, H.L. et al., 1997, DNA
vaccines, Seminars in Immunology 9(5):271-283.

5.3 Attenuation of hSARS Virus or Variants Thereof

The hSARS virus or variants thereof of the invention can be genetically engineered
to exhibit an attenuated phenotype. In particular, the viruses of the invention exhibit an
attenuated phenotype in a subject to which the virus is administered as a vaccine.
Attenuation can be achieved by any method known to a skilled artisan. Without being
bound by theory, the attenuated phenotype of the viruses of the invention can be caused, e.g.,
by using a virus that naturally does not replicate well in an intended host species, for
example, by reduced replication of the viral genome, by reduced ability of the virus to infect
a host cell, or by reduced ability of the viral proteins to assemble to an infectious viral
particle relative to the wild type strain of the virus.

The attenuated phenotypes of hSARS virus or variants thereof can be tested by any
method known to the artisan. A candidate virus can, for example, be tested for its ability to
infect a host or for the rate of replication in a cell culture system. In certain embodiments,
growth curves at different temperatures are used to test the attenuated phenotype of the
virus. For example, an attenuated virus is able to grow at 35°C, but not at 39°C or 40°C. In
certain embodiments, different cell lines can be used to evaluate the attenuated phenotype of
the virus. For example, an attenuated virus may only be able to grow in monkey cell lines
but not the human cell lines, or the achievable virus titers in different cell lines are different
for the attenuated virus. In certain embodiments, viral replication in the respiratory tract of
a small animal model, including but not limited to, hamsters, cotton rats, mice and guinea
pigs, is used to evaluate the attenuated phenotypes of the virus. In other embodiments, the
immune response induced by the virus, including but not limited to, the antibody titers (e.g.,
assayed by plaque reduction neutralization assay or ELISA) is used to evaluate the
attenuated phenotypes of the virus. In a specific embodiment, the plaque reduction
neutralization assay or ELISA is carried out at a low dose. In certain embodiments, the
ability of the hSARS virus to elicit pathological symptoms in an animal model can be tested.
A reduced ability of the virus to elicit pathological symptoms in an animal model system is
indicative of its attenuated phenotype. In a specific embodiment, the candidate viruses are
tested in a monkey model for nasal infection, indicated by mucous production.
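
By way of illustration only, the temperature-based growth test described above could be
scored as in the following sketch. The function names, the detection limit, and the
2-log10 cut-off are assumptions introduced for this example and are not values taken from
the specification; peak titers are assumed to be supplied in PFU/mL, keyed by incubation
temperature in °C.

```python
import math

# Assumed assay floor in PFU/mL (hypothetical value, for illustration only).
DETECTION_LIMIT = 10.0


def log10_titer(titer):
    """Return log10 of a titer, clamping values below the assumed detection limit."""
    return math.log10(max(titer, DETECTION_LIMIT))


def is_temperature_sensitive(peak_titers, permissive=35.0,
                             restrictive=(39.0, 40.0), min_log10_drop=2.0):
    """Flag a candidate as temperature sensitive.

    peak_titers maps temperature (deg C) to peak titer (PFU/mL). The candidate is
    flagged when its peak titer at every measured restrictive temperature is at
    least min_log10_drop log10 units below the peak at the permissive temperature.
    The 2-log10 default is an illustrative assumption, not a value from the text.
    """
    base = log10_titer(peak_titers[permissive])
    measured = [t for t in restrictive if t in peak_titers]
    if not measured:
        raise ValueError("no restrictive-temperature measurement supplied")
    return all(base - log10_titer(peak_titers[t]) >= min_log10_drop
               for t in measured)


# Hypothetical example data: a candidate that grows at 35 deg C but barely at 39 deg C,
# and a wild type strain that grows at all three temperatures.
candidate = {35.0: 1e6, 39.0: 50.0}
wild_type = {35.0: 1e6, 39.0: 5e5, 40.0: 2e5}
print(is_temperature_sensitive(candidate), is_temperature_sensitive(wild_type))
```

Any other threshold, or a comparison of complete growth curves rather than peak titers,
could equally be used.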

The viruses of the invention can be attenuated such that one or more of the
functional characteristics of the virus are impaired. In certain embodiments, attenuation is
measured in comparison to the wild type strain of the virus from which the attenuated virus
is derived. In other embodiments, attenuation is determined by comparing the growth of an
attenuated virus in different host systems. Thus, for a non-limiting example, hSARS virus
or a variant thereof is said to be attenuated when grown in a human host if the growth of the
hSARS or variant thereof in the human host is reduced compared to the non-attenuated
hSARS or variant thereof.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host and is capable of replicating in a host such that infectious viral particles are produced.
In comparison to the wild type strain, however, the attenuated strain grows to lower titers or
grows more slowly. Any technique known to the skilled artisan can be used to determine
the growth curve of the attenuated virus and compare it to the growth curve of the wild type
virus.

In certain embodiments, the attenuated virus of the invention (e.g., a recombinant or
chimeric hSARS) cannot replicate in human cells as well as the wild type virus (e.g., wild
type hSARS) does. However, the attenuated virus can replicate well in a cell line that lacks
interferon functions, such as Vero cells.

In other embodiments, the attenuated virus of the invention is capable of infecting a
host, of replicating in the host, and of causing proteins of the virus of the invention to be
inserted into the cytoplasmic membrane, but the attenuated virus does not cause the host to
produce new infectious viral particles. In certain embodiments, the attenuated virus infects
the host, replicates in the host, and causes viral proteins to be inserted in the cytoplasmic
membrane of the host with the same efficiency as the wild type hSARS. In other
embodiments, the ability of the attenuated virus to cause viral proteins to be inserted into
the cytoplasmic membrane of the host cell is reduced compared to the wild type virus. In
certain embodiments, the ability of the attenuated hSARS virus to replicate in the host is
reduced compared to the wild type virus. Any technique known to the skilled artisan can be
used to determine whether a virus is capable of infecting a mammalian cell, of replicating
within the host, and of causing viral proteins to be inserted into the cytoplasmic membrane
of the host.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host. In contrast to the wild type hSARS, however, the attenuated hSARS cannot be
replicated in the host. In a specific embodiment, the attenuated hSARS virus can infect a
host and can cause the host to insert viral proteins in its cytoplasmic membranes, but the
attenuated virus is incapable of being replicated in the host. Any method known to the
skilled artisan can be used to test whether the attenuated hSARS has infected the host and
has caused the host to insert viral proteins in its cytoplasmic membranes.

In certain embodiments, the ability of the attenuated virus to infect a host is reduced
compared to the ability of the wild type virus to infect the same host. Any technique known
to the skilled artisan can be used to determine whether a virus is capable of infecting a host.

In certain embodiments, mutations (e.g., missense mutations) are introduced into the
genome of the virus, for example, into the sequence of SEQ ID NO:1, 11, 13, 15, 2471 or
2473, to generate a virus with an attenuated phenotype. Mutations (e.g., missense
mutations) can be introduced into the structural genes and/or regulatory genes of the hSARS.
Mutations can be additions, substitutions, deletions, or combinations thereof. Such variants
of hSARS can be screened for a predicted functionality, such as infectivity, replication
ability, protein synthesis ability, assembling ability, as well as cytopathic effect in cell
cultures. In a specific embodiment, the missense mutation is a cold-sensitive mutation. In
another embodiment, the missense mutation is a heat-sensitive mutation. In another
embodiment, the missense mutation prevents a normal processing or cleavage of the viral
proteins.
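
As a purely illustrative aid, the following sketch shows how a single-nucleotide
substitution in an in-frame coding sequence can be classified as silent, missense, or
nonsense using the standard genetic code. The function name and the short test sequence
are hypothetical and are not taken from any SEQ ID NO of the invention; position is
0-based and the sequence is assumed to start at the first base of a codon.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(coding_seq, position, new_base):
    """Classify a point substitution at a 0-based position of an in-frame sequence."""
    seq = coding_seq.upper().replace("U", "T")
    base = new_base.upper().replace("U", "T")
    start = position - position % 3
    old_codon = seq[start:start + 3]
    if len(old_codon) < 3:
        raise ValueError("position falls in an incomplete codon")
    offset = position % 3
    new_codon = old_codon[:offset] + base + old_codon[offset + 1:]
    if new_codon == old_codon:
        return "unchanged"
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "silent"        # same amino acid encoded
    if new_aa == "*":
        return "nonsense"      # premature stop codon introduced
    if old_aa == "*":
        return "stop-loss"     # stop codon replaced by an amino acid
    return "missense"          # different amino acid encoded


# Hypothetical six-base sequence ATG GAA (Met-Glu).
print(classify_substitution("ATGGAA", 4, "T"))  # GAA -> GTA
```

Whether a given missense change actually yields a cold-sensitive or heat-sensitive
phenotype cannot be predicted by such a classification and would have to be determined
experimentally.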

In other embodiments, deletions are introduced into the genome of the hSARS virus, 
which result in the attenuation of the virus. 

In certain embodiments, attenuation of the virus is achieved by replacing a gene of
the wild type virus with a gene of a virus of a different species, of a different subgroup, or
of a different variant. In another aspect, attenuation of the virus is achieved by replacing
one or more specific domains of a protein of the wild type virus with domains derived from
the corresponding protein of a virus of a different species. In certain other embodiments,
attenuation of the virus is achieved by deleting one or more specific domains of a protein of
the wild type virus.

When a live attenuated vaccine is used, its safety must also be considered. The
vaccine must not cause disease. Any techniques known in the art that can make a vaccine
safe may be used in the present invention. In addition to attenuation techniques, other
techniques may be used. One non-limiting example is to use a soluble heterologous gene
that cannot be incorporated into the virion membrane. For example, a single copy of the
soluble version of a viral transmembrane protein lacking the transmembrane and cytosolic
domains thereof, can be used.

Various assays can be used to test the safety of a vaccine. For example, sucrose
gradients and neutralization assays can be used to test the safety. A sucrose gradient assay
can be used to determine whether a heterologous protein is inserted in a virion. If the
heterologous protein is inserted in the virion, the virion should be tested for its ability to
cause symptoms in an appropriate animal model since the virus may have acquired new,
possibly pathological, properties.

5.4 Adjuvants and Carrier Molecules 

hSARS-associated antigens are administered with one or more adjuvants. In one
embodiment, the hSARS-associated antigen is administered together with a mineral salt
adjuvant or mineral salt gel adjuvant. Such mineral salt and mineral salt gel adjuvants
include, but are not limited to, aluminum hydroxide (ALHYDROGEL, REHYDRAGEL),
aluminum phosphate gel, aluminum hydroxyphosphate (ADJU-PHOS), and calcium
phosphate.

In another embodiment, hSARS-associated antigen is administered with an
immunostimulatory adjuvant. This class of adjuvants includes, but is not limited to,
cytokines (e.g., interleukin-2, interleukin-7, interleukin-12, granulocyte-macrophage colony
stimulating factor (GM-CSF), interferon-γ, interleukin-1β (IL-1β), and IL-1β peptide or
Sclavo Peptide), cytokine-containing liposomes, triterpenoid glycosides or saponins (e.g.,
QuilA and QS-21, also sold under the trademark STIMULON, ISCOPREP), Muramyl
Dipeptide (MDP) derivatives, such as N-acetyl-muramyl-L-threonyl-D-isoglutamine
(Threonyl-MDP, sold under the trademark TERMURTIDE), GMDP,
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine,
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine,
muramyl tripeptide phosphatidylethanolamine (MTP-PE), unmethylated CpG dinucleotides and
oligonucleotides, such as bacterial DNA and fragments thereof, LPS, monophosphoryl
Lipid A (3D-MLA, sold under the trademark MPL), and polyphosphazenes.

In another embodiment, the adjuvant used is a particulate adjuvant, including, but not
limited to, emulsions, e.g., Freund's Complete Adjuvant, Freund's Incomplete Adjuvant,
squalene or squalane oil-in-water adjuvant formulations, such as SAF and MF59, e.g.,
prepared with block-copolymers, such as L-121 (polyoxypropylene/polyoxyethylene) sold
under the trademark PLURONIC L-121, Liposomes, Virosomes, cochleates, and immune
stimulating complex, which is sold under the trademark ISCOM.

In another embodiment, a microparticulate adjuvant is used. Microparticulate
adjuvants include, but are not limited to, biodegradable and biocompatible polyesters,
homo- and copolymers of lactic acid (PLA) and glycolic acid (PGA), poly(lactide-co-glycolides)
(PLGA) microparticles, polymers that self-associate into particulates (poloxamer particles),
soluble polymers (polyphosphazenes), and virus-like particles (VLPs) such as recombinant
protein particulates, e.g., hepatitis B surface antigen (HBsAg).

Yet another class of adjuvants that may be used includes mucosal adjuvants,
including but not limited to heat-labile enterotoxin from Escherichia coli (LT), cholera
holotoxin (CT) and cholera Toxin B Subunit (CTB) from Vibrio cholerae, mutant toxins
(e.g., LTK63 and LTR72), microparticles, and polymerized liposomes.

In other embodiments, any of the above classes of adjuvants may be used in
combination with each other or with other adjuvants. Non-limiting examples
of combination adjuvant preparations that can be used to administer the hSARS-associated
antigens of the invention include liposomes containing immunostimulatory protein,
cytokines, or T-cell and/or B-cell peptides, or microbes with or without entrapped IL-2 or
microparticles containing enterotoxin. Other adjuvants known in the art are also included
within the scope of the invention (see Vaccine Design: The Subunit and Adjuvant Approach,
Chap. 7, Michael F. Powell and Mark J. Newman (eds.), Plenum Press, New York, 1995,
which is incorporated herein in its entirety).

The effectiveness of an adjuvant may be determined by measuring the induction of 
antibodies directed against an immunogenic polypeptide containing a hSARS polypeptide 
epitope, the antibodies resulting from administration of this polypeptide in vaccines which 
are also comprised of the various adjuvants. 
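
For illustration, one conventional way to express the induction of antibodies by an
adjuvanted formulation is the ratio of geometric mean titers (GMT) between an adjuvanted
group and a non-adjuvanted group. The sketch below assumes reciprocal endpoint titers
(e.g., from ELISA) as input; the function names and the example values are hypothetical
and do not represent data from the specification.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of a list of positive reciprocal endpoint titers."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def adjuvant_fold_increase(with_adjuvant, without_adjuvant):
    """Ratio of the GMT with adjuvant to the GMT without adjuvant."""
    return (geometric_mean_titer(with_adjuvant)
            / geometric_mean_titer(without_adjuvant))


# Hypothetical reciprocal titers for two small groups of immunized animals.
print(adjuvant_fold_increase([800, 3200], [100, 400]))
```

A fold increase by itself does not establish significance; group size and an appropriate
statistical test would also need to be considered.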

The polypeptides may be formulated into the vaccine as neutral or salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with free amino
groups of the peptide) and which are formed with inorganic acids, such as, for example,
hydrochloric or phosphoric acids, or organic acids such as acetic, oxalic, tartaric, maleic,
and the like. Salts formed with free carboxyl groups may also be derived from inorganic
bases, such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides,
and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine,
procaine and the like.

The vaccines of the invention may be multivalent or univalent. Multivalent vaccines
are made from recombinant viruses that direct the expression of more than one antigen.

Many methods may be used to introduce the vaccine formulations of the invention;
these include but are not limited to oral, intradermal, intramuscular, intraperitoneal,
intravenous, subcutaneous, intranasal routes, and via scarification (scratching through the
top layers of skin, e.g., using a bifurcated needle).

The patient to which the vaccine is administered is preferably a mammal, most
preferably a human, but can also be a non-human animal including but not limited to cows,
horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice and rats.

5.5 Preparation of Antibodies

Antibodies which specifically recognize a polypeptide of the invention, such as, but
not limited to, polypeptides comprising the sequence of SEQ ID NO:2, 12, 14, 2472, 2474,
and polypeptides as shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107)
and 12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470), or hSARS epitope or
antigen-binding fragments thereof can be used for detecting, screening, and isolating the
polypeptide of the invention or fragments thereof, or similar sequences that might encode
similar enzymes from the other organisms. For example, in one specific embodiment, an
antibody which immunospecifically binds hSARS epitope, or a fragment thereof, can be
used for various in vitro detection assays, including enzyme-linked immunosorbent assays
(ELISA), radioimmunoassays, Western blot, etc., for the detection of a polypeptide of the
invention or, preferably, hSARS, in samples, for example, a biological material, including
cells, cell culture media (e.g., bacterial cell culture media, mammalian cell culture media,
insect cell culture media, yeast cell culture media, etc.), blood, plasma, serum, tissues,
sputum, nasopharyngeal aspirates, etc.

Antibodies specific for a polypeptide of the invention or any epitope of hSARS may
be generated by any suitable method known in the art. Polyclonal antibodies to an
antigen-of-interest, for example, the hSARS virus from deposit no. CCTCC-V200303, or
which comprises a nucleotide sequence of SEQ ID NO: 15, can be produced by various
procedures well known in the art. For example, an antigen can be administered to various
host animals including, but not limited to, rabbits, mice, rats, etc., to induce the production
of antisera containing polyclonal antibodies specific for the antigen. Various adjuvants may
be used to increase the immunological response, depending on the host species, and include
but are not limited to, Freund's (complete and incomplete) adjuvant, mineral gels such as
aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and
potentially useful adjuvants for humans such as BCG (Bacille Calmette-Guerin) and
Corynebacterium parvum. Such adjuvants are also well known in the art.

Monoclonal antibodies can be prepared using a wide variety of techniques known in
the art including the use of hybridoma, recombinant, and phage display technologies, or a
combination thereof. For example, monoclonal antibodies can be produced using
hybridoma techniques including those known in the art and taught, for example, in Harlow
et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, 2nd ed.
1988); Hammerling, et al., in: Monoclonal Antibodies and T-Cell Hybridomas, pp. 563-681
(Elsevier, N.Y., 1981) (both of which are incorporated by reference in their entireties). The
term "monoclonal antibody" as used herein is not limited to antibodies produced through
hybridoma technology. The term "monoclonal antibody" refers to an antibody that is
derived from a single clone, including any eukaryotic, prokaryotic, or phage clone, and not
the method by which it is produced.

Methods for producing and screening for specific antibodies using hybridoma
technology are routine and well known in the art. In a non-limiting example, mice can be
immunized with an antigen of interest or a cell expressing such an antigen. Once an
immune response is detected, e.g., antibodies specific for the antigen are detected in the
mouse serum, the mouse spleen is harvested and splenocytes isolated. The splenocytes are
then fused by well known techniques to any suitable myeloma cells. Hybridomas are
selected and cloned by limiting dilution. The hybridoma clones are then assayed by
methods known in the art for cells that secrete antibodies capable of binding the antigen.
Ascites fluid, which generally contains high levels of antibodies, can be generated by
inoculating mice intraperitoneally with positive hybridoma clones.
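
As an illustrative aside on cloning by limiting dilution, the number of cells seeded per
well is commonly modeled as Poisson-distributed. Under that assumption, which is made
here only for the purpose of the example and is not stated in the specification, the
following sketch estimates the probability that a well showing growth arose from exactly
one cell. The function name and the seeding densities are hypothetical.

```python
import math


def prob_clonal_given_growth(mean_cells_per_well):
    """P(exactly one cell | at least one cell) under a Poisson seeding model."""
    lam = mean_cells_per_well
    if lam <= 0:
        raise ValueError("mean cells per well must be positive")
    p_one = lam * math.exp(-lam)          # P(k = 1)
    p_growth = 1.0 - math.exp(-lam)       # P(k >= 1)
    return p_one / p_growth


# Hypothetical seeding densities, in mean cells per well.
for density in (0.1, 0.3, 1.0):
    print(density, round(prob_clonal_given_growth(density), 3))
```

Lower seeding densities raise the likelihood that a growing well is monoclonal, at the
cost of more empty wells; repeated rounds of subcloning are typically used in practice.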

Antibody fragments which recognize specific epitopes may be generated by known
techniques. For example, Fab and F(ab')2 fragments may be produced by proteolytic
cleavage of immunoglobulin molecules, using enzymes such as papain (to produce Fab
fragments) or pepsin (to produce F(ab')2 fragments). F(ab')2 fragments contain the
complete light chain, and the variable region, the CH1 region and the hinge region of the
heavy chain.

The antibodies of the invention or fragments thereof can also be produced by any
method known in the art for the synthesis of antibodies, in particular, by chemical synthesis
or, preferably, by recombinant expression techniques.

The nucleotide sequence encoding an antibody may be obtained from any
information available to those skilled in the art (i.e., from Genbank, the literature, or by
routine cloning and sequence analysis). If a clone containing a nucleic acid encoding a
particular antibody or an epitope-binding fragment thereof is not available, but the sequence
of the antibody molecule or epitope-binding fragment thereof is known, a nucleic acid
encoding the immunoglobulin may be chemically synthesized or obtained from a suitable
source (e.g., an antibody cDNA library, or a cDNA library generated from, or nucleic acid,
preferably poly A+ RNA, isolated from any tissue or cells expressing the antibody, such as
hybridoma cells selected to express an antibody) by PCR amplification using synthetic
primers hybridizable to the 3' and 5' ends of the sequence or by cloning using an
oligonucleotide probe specific for the particular gene sequence to identify, e.g., a cDNA
clone from a cDNA library that encodes the antibody. Amplified nucleic acids generated by
PCR may then be cloned into replicable cloning vectors using any method well known in
the art.

Once the nucleotide sequence of the antibody is determined, the nucleotide sequence
of the antibody may be manipulated using methods well known in the art for the
manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed
mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., supra;
and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology, John Wiley & Sons,
NY, which are both incorporated by reference herein in their entireties), to generate
antibodies having a different amino acid sequence by, for example, introducing amino acid
substitutions, deletions, and/or insertions into the epitope-binding domain regions of the
antibodies or any portion of antibodies which may enhance or reduce biological activities of
the antibodies.

Recombinant expression of an antibody requires construction of an expression
vector containing a nucleotide sequence that encodes the antibody. Once a nucleotide
sequence encoding an antibody molecule or a heavy or light chain of an antibody, or portion
thereof has been obtained, the vector for the production of the antibody molecule may be
produced by recombinant DNA technology using techniques well known in the art as
discussed in the previous sections. Methods which are well known to those skilled in the art
can be used to construct expression vectors containing antibody coding sequences and
appropriate transcriptional and translational control signals. These methods include, for
example, in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. The nucleotide sequence encoding the heavy-chain variable region,
light-chain variable region, both the heavy-chain and light-chain variable regions, an
epitope-binding fragment of the heavy- and/or light-chain variable region, or one or more
complementarity determining regions (CDRs) of an antibody may be cloned into such a
vector for expression. The expression vector thus prepared can then be introduced into
appropriate host cells for the expression of the antibody. Accordingly, the invention
includes host cells containing a polynucleotide encoding an antibody specific for the
polypeptides of the invention or fragments thereof.

The host cell may be co-transfected with two expression vectors of the invention, the
first vector encoding a heavy chain derived polypeptide and the second vector encoding a
light chain derived polypeptide. The two vectors may contain identical selectable markers
which enable equal expression of heavy and light chain polypeptides or different selectable
markers to ensure maintenance of both plasmids. Alternatively, a single vector may be used
which encodes, and is capable of expressing, both heavy and light chain polypeptides. In
such situations, the light chain should be placed before the heavy chain to avoid an excess
of toxic free heavy chain (Proudfoot, Nature, 322:52, 1986; and Kohler, Proc. Natl. Acad.
Sci. USA, 77:2197, 1980). The coding sequences for the heavy and light chains may
comprise cDNA or genomic DNA.

In another embodiment, antibodies can also be generated using various phage
display methods known in the art. In phage display methods, functional antibody domains
are displayed on the surface of phage particles which carry the polynucleotide sequences
encoding them. In a particular embodiment, such phage can be utilized to display antigen
binding domains, such as Fab and Fv or disulfide-bond stabilized Fv, expressed from a
repertoire or combinatorial antibody library (e.g., human or murine). Phage expressing an
antigen binding domain that binds the antigen of interest can be selected or identified with
antigen, e.g., using labeled antigen or antigen bound or captured to a solid surface or bead.
Phage used in these methods are typically filamentous phage, including fd and M13. The
antigen binding domains are expressed as a recombinantly fused protein to either the phage
gene III or gene VIII protein. Examples of phage display methods that can be used to make
the immunoglobulins, or fragments thereof, of the present invention include those disclosed
in Brinkman et al., J. Immunol. Methods, 182:41-50, 1995; Ames et al., J. Immunol.
Methods, 184:177-186, 1995; Kettleborough et al., Eur. J. Immunol., 24:952-958, 1994;
Persic et al., Gene, 187:9-18, 1997; Burton et al., Advances in Immunology, 57:191-280,
1994; PCT application No. PCT/GB91/01134; PCT publications WO 90/02809; WO
91/10737; WO 92/01047; WO 92/18619; WO 93/11236; WO 95/15982; WO 95/20401; and
U.S. Patent Nos. 5,698,426; 5,223,409; 5,403,484; 5,580,717; 5,427,908; 5,750,753;
5,821,047; 5,571,698; 5,427,908; 5,516,637; 5,780,225; 5,658,727; 5,733,743 and
5,969,108; each of which is incorporated herein by reference in its entirety.

As described in the above references, after phage selection, the antibody coding
regions from the phage can be isolated and used to generate whole antibodies, including
human antibodies, or any other desired fragments, and expressed in any desired host,
including mammalian cells, insect cells, plant cells, yeast, and bacteria, e.g., as described in
detail below. For example, techniques to recombinantly produce Fab, Fab' and F(ab')2
fragments can also be employed using methods known in the art such as those disclosed in
PCT publication WO 92/22324; Mullinax et al., BioTechniques, 12(6):864-869, 1992; and
Sawai et al., AJRI, 34:26-34, 1995; and Better et al., Science, 240:1041-1043, 1988 (each of
which is incorporated by reference in its entirety). Examples of techniques which can be
used to produce single-chain Fvs and antibodies include those described in U.S. Patent Nos.
4,946,778 and 5,258,498; Huston et al., Methods in Enzymology, 203:46-88, 1991; Shu et
al., PNAS, 90:7995-7999, 1993; and Skerra et al., Science, 240:1038-1040, 1988.

Once an antibody molecule of the invention has been produced by any methods
described above, it may then be purified by any method known in the art for purification of
an immunoglobulin molecule, for example, by chromatography (e.g., ion exchange, affinity,
particularly by affinity for the specific antigen after Protein A or Protein G purification, and
sizing column chromatography), centrifugation, differential solubility, or by any other
standard techniques for the purification of proteins. Further, the antibodies of the present
invention or fragments thereof may be fused to heterologous polypeptide sequences
described herein or otherwise known in the art to facilitate purification.

For some uses, including in vivo use of antibodies in humans and in vitro detection
assays, it may be preferable to use chimeric, humanized, or human antibodies. A chimeric
antibody is a molecule in which different portions of the antibody are derived from different
animal species, such as antibodies having a variable region derived from a murine
monoclonal antibody and a constant region derived from a human immunoglobulin.
Methods for producing chimeric antibodies are known in the art. See e.g., Morrison,
Science, 229:1202, 1985; Oi et al., BioTechniques, 4:214, 1986; Gillies et al., J. Immunol.
Methods, 125:191-202, 1989; U.S. Patent Nos. 5,807,715; 4,816,567; and 4,816,397, which
are incorporated herein by reference in their entireties. Humanized antibodies are antibody
molecules from non-human species that bind the desired antigen having one or more
complementarity determining regions (CDRs) from the non-human species and framework
regions from a human immunoglobulin molecule. Often, framework residues in the human
framework regions will be substituted with the corresponding residue from the CDR donor
antibody to alter, preferably improve, antigen binding. These framework substitutions are
identified by methods well known in the art, e.g., by modeling of the interactions of the
CDR and framework residues to identify framework residues important for antigen binding
and sequence comparison to identify unusual framework residues at particular positions.
See, e.g., Queen et al., U.S. Patent No. 5,585,089; Riechmann et al., Nature, 332:323, 1988,
which are incorporated herein by reference in their entireties. Antibodies can be humanized
using a variety of techniques known in the art including, for example, CDR-grafting (EP
239,400; PCT publication WO 91/09967; U.S. Patent Nos. 5,225,539; 5,530,101 and
5,585,089), veneering or resurfacing (EP 592,106; EP 519,596; Padlan, Molecular
Immunology, 28(4/5):489-498, 1991; Studnicka et al., Protein Engineering, 7(6):805-814,
1994; Roguska et al., Proc. Natl. Acad. Sci. USA, 91:969-973, 1994), and chain shuffling
(U.S. Patent No. 5,565,332), all of which are hereby incorporated by reference in their
entireties.

Completely human antibodies are particularly desirable for therapeutic treatment of
human patients. Human antibodies can be made by a variety of methods known in the art
including phage display methods described above using antibody libraries derived from
human immunoglobulin sequences. See U.S. Patent Nos. 4,444,887 and 4,716,111; and
PCT publications WO 98/46645; WO 98/50433; WO 98/24893; WO 98/16654; WO
96/34096; WO 96/33735; and WO 91/10741, each of which is incorporated herein by
reference in its entirety.






Human antibodies can also be produced using transgenic mice which are incapable
of expressing functional endogenous immunoglobulins, but which can express human
immunoglobulin genes. For an overview of this technology for producing human
antibodies, see Lonberg and Huszar, Int. Rev. Immunol., 13:65-93, 1995. For a detailed
discussion of this technology for producing human antibodies and human monoclonal
antibodies and protocols for producing such antibodies, see, e.g., PCT publications WO
98/24893; WO 92/01047; WO 96/34096; WO 96/33735; European Patent No. 0 598 877;
U.S. Patent Nos. 5,413,923; 5,625,126; 5,633,425; 5,569,825; 5,661,016; 5,545,806;
5,814,318; 5,885,793; 5,916,771; and 5,939,598, which are incorporated by reference herein
in their entireties. In addition, companies such as Abgenix, Inc. (Fremont, CA), Medarex
(NJ) and Genpharm (San Jose, CA) can be engaged to provide human antibodies directed
against a selected antigen using technology similar to that described above.

Completely human antibodies which recognize a selected epitope can be generated
using a technique referred to as "guided selection." In this approach a selected non-human
monoclonal antibody, e.g., a mouse antibody, is used to guide the selection of a completely
human antibody recognizing the same epitope (Jespers et al., Bio/technology, 12:899-903,
1988).

Antibodies fused or conjugated to heterologous polypeptides may be used in in vitro
immunoassays and in purification methods (e.g., affinity chromatography) well known in
the art. See, e.g., PCT publication Number WO 93/21232; EP 439,095; Naramura et al.,
Immunol. Lett., 39:91-99, 1994; U.S. Patent 5,474,981; Gillies et al., PNAS, 89:1428-1432,
1992; and Fell et al., J. Immunol., 146:2446-2452, 1991, which are incorporated herein by
reference in their entireties.

Antibodies may also be attached to solid supports, which are particularly useful for
immunoassays or purification of the polypeptides of the invention or fragments, derivatives,
analogs, or variants thereof, or similar molecules having similar enzymatic activities to
the polypeptide of the invention. Such solid supports include, but are not limited to, glass,
cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene.



5.6 Pharmaceutical Compositions and Kits

The present invention encompasses pharmaceutical compositions comprising anti-viral
agents of the present invention. In a specific embodiment, the anti-viral agent is an
antibody which immunospecifically binds and neutralizes the hSARS virus or variants
thereof, or any proteins derived therefrom. In another specific embodiment, the anti-viral
agent is a polypeptide or nucleic acid molecule of the invention. The pharmaceutical
compositions have utility as an anti-viral prophylactic agent and may be administered to a
subject where the subject has been exposed or is expected to be exposed to a virus.

Various delivery systems are known and can be used to administer the
pharmaceutical composition of the invention, e.g., encapsulation in liposomes,
microparticles, microcapsules, recombinant cells capable of expressing the mutant viruses,
and receptor-mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432).
Methods of introduction include but are not limited to intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, intranasal, epidural, and oral routes. The
compounds may be administered by any convenient route, for example by infusion or bolus
injection, by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa,
rectal and intestinal mucosa, etc.) and may be administered together with other biologically
active agents. Administration can be systemic or local. In a preferred embodiment, it may
be desirable to introduce the pharmaceutical compositions of the invention into the lungs by
any suitable route. Pulmonary administration can also be employed, e.g., by use of an
inhaler or nebulizer, and formulation with an aerosolizing agent.

In a specific embodiment, it may be desirable to administer the pharmaceutical
compositions of the invention locally to the area in need of treatment; this may be achieved
by, for example, and not by way of limitation, local infusion during surgery, topical
application, e.g., in conjunction with a wound dressing after surgery, by injection, by means
of a catheter, by means of a suppository, or by means of an implant, said implant being of a
porous, non-porous, or gelatinous material, including membranes, such as silastic
membranes, or fibers. In one embodiment, administration can be by direct injection at the
site (or former site) of infected tissues.

In another embodiment, the pharmaceutical composition can be delivered in a
vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al., in
Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler
(eds.), Liss, New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see
generally ibid.).

In yet another embodiment, the pharmaceutical composition can be delivered in a
controlled release system. In one embodiment, a pump may be used (see Langer, supra;
Sefton, 1987, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507;
and Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric
materials can be used (see Medical Applications of Controlled Release, Langer and Wise
(eds.), CRC Pres., Boca Raton, Florida (1974); Controlled Drug Bioavailability, Drug
Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Langer
and Peppas, J. Macromol. Sci. Rev. Macromol. Chem. 23:61 (1983); see also Levy et al.,
1985, Science 228:190; During et al., 1989, Ann. Neurol. 25:351; Howard et al., 1989, J.
Neurosurg. 71:105). In yet another embodiment, a controlled release system can be placed
in proximity of the composition's target, i.e., the lung, thus requiring only a fraction of the
systemic dose (see, e.g., Goodson, in Medical Applications of Controlled Release, supra,
vol. 2, pp. 115-138 (1984)).

Other controlled release systems are discussed in the review by Langer (Science 
249:1527-1533 (1990)). 

The pharmaceutical compositions of the present invention comprise a
therapeutically effective amount of a live attenuated, inactivated or killed hSARS virus, or
recombinant or chimeric hSARS virus, and a pharmaceutically acceptable carrier. In a
specific embodiment, the term "pharmaceutically acceptable" means approved by a
regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or
other generally recognized pharmacopeia for use in animals, and more particularly in
humans. The term "carrier" refers to a diluent, adjuvant, excipient, or vehicle with which
the pharmaceutical composition is administered. Such pharmaceutical carriers can be sterile
liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic
origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. Water is a
preferred carrier when the pharmaceutical composition is administered intravenously.
Saline solutions and aqueous dextrose and glycerol solutions can also be employed as liquid
carriers, particularly for injectable solutions. Suitable pharmaceutical excipients include
starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate,
glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol,
water, ethanol and the like. The composition, if desired, can also contain minor amounts of
wetting or emulsifying agents, or pH buffering agents. These compositions can take the
form of solutions, suspensions, emulsions, tablets, pills, capsules, powders, sustained release
formulations and the like. The composition can be formulated as a suppository, with
traditional binders and carriers such as triglycerides. Oral formulation can include standard
carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate,
sodium saccharine, cellulose, magnesium carbonate, etc. Examples of suitable
pharmaceutical carriers are described in "Remington's Pharmaceutical Sciences" by E.W.
Martin. The formulation should suit the mode of administration.

In a preferred embodiment, the composition is formulated in accordance with
routine procedures as a pharmaceutical composition adapted for intravenous administration
to human beings. Typically, compositions for intravenous administration are solutions in
sterile isotonic aqueous buffer. Where necessary, the composition may also include a
solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the
injection. Generally, the ingredients are supplied either separately or mixed together in unit
dosage form, for example, as a dry lyophilized powder or water-free concentrate in a
hermetically sealed container such as an ampoule or sachette indicating the quantity of
active agent. Where the composition is to be administered by infusion, it can be dispensed
with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the
composition is administered by injection, an ampoule of sterile water for injection or saline
can be provided so that the ingredients may be mixed prior to administration.

The pharmaceutical compositions of the invention can be formulated as neutral or
salt forms. Pharmaceutically acceptable salts include those formed with free amino groups
such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and
those formed with free carboxyl groups such as those derived from sodium, potassium,
ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino
ethanol, histidine, procaine, etc.

The amount of the pharmaceutical composition of the invention which will be
effective in the treatment of a particular disorder or condition will depend on the nature of
the disorder or condition, and can be determined by standard clinical techniques. In
addition, in vitro assays may optionally be employed to help identify optimal dosage ranges.
The precise dose to be employed in the formulation will also depend on the route of
administration, and the seriousness of the disease or disorder, and should be decided
according to the judgment of the practitioner and each patient's circumstances. However,
suitable dosage ranges for intravenous administration are generally about 20-500
micrograms of active compound per kilogram body weight. Suitable dosage ranges for
intranasal administration are generally about 0.01 pg/kg body weight to 1 mg/kg body
weight. Effective doses may be extrapolated from dose response curves derived from in
vitro or animal model test systems.
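
As a purely arithmetic illustration of the per-kilogram ranges above, the following sketch scales the stated intravenous range to a given body weight. It assumes the range is read as 20 to 500 micrograms per kilogram; the function name and the 70 kg example weight are hypothetical and are not taken from the specification, and the result is not dosing guidance.

```python
def intravenous_dose_range_ug(body_weight_kg, low_ug_per_kg=20.0, high_ug_per_kg=500.0):
    """Scale a per-kilogram dosage range to a total dose range in micrograms.

    The default bounds assume the intravenous range is read as 20 to 500
    micrograms of active compound per kilogram body weight.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_ug_per_kg * body_weight_kg, high_ug_per_kg * body_weight_kg)


if __name__ == "__main__":
    # Hypothetical 70 kg subject, used only to show the arithmetic.
    low, high = intravenous_dose_range_ug(70.0)
    print(f"{low:.0f} to {high:.0f} micrograms")
```

For the hypothetical 70 kg subject this gives 1,400 to 35,000 micrograms, i.e. 1.4 to 35 milligrams.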

Suppositories generally contain active ingredient in the range of 0.5% to 10% by
weight; oral formulations preferably contain 10% to 95% active ingredient.

The invention also provides a pharmaceutical pack or kit comprising one or more
containers filled with one or more of the ingredients of the pharmaceutical compositions of
the invention. Optionally associated with such container(s) can be a notice in the form
prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In a preferred embodiment, the kit
contains an anti-viral agent of the invention, e.g., an antibody specific for the polypeptides
encoded by a nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471 or 2473, or as shown
in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS: 1109-
1589, 1591-1964 and 1966-2470), or any hSARS epitope, or a polypeptide or protein of the
present invention, or a nucleic acid molecule of the invention, alone or in combination with
adjuvants, antivirals, antibiotics, analgesics, bronchodilators, or other pharmaceutically
acceptable excipients.

The present invention further encompasses kits comprising a container containing a
pharmaceutical composition of the present invention and instructions for use.

5.7 Detection Assays 

The present invention provides a method for detecting an antibody, which
immunospecifically binds to the hSARS virus, in a biological sample, for example blood,
serum, plasma, saliva, urine, etc., from a patient suffering from SARS. In a specific
embodiment, the method comprises contacting the sample with the hSARS virus, for
example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence of
SEQ ID NO: 15, directly immobilized on a substrate and detecting the virus-bound antibody
directly or indirectly by a labeled heterologous anti-isotype antibody. In another specific
embodiment, the sample is contacted with a host cell which is infected by the hSARS virus,
for example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence
of SEQ ID NO: 15, and the bound antibody can be detected by immunofluorescent assay as
described in Section 6.5, infra.

An exemplary method for detecting the presence or absence of a polypeptide or
nucleic acid of the invention in a biological sample involves obtaining a biological sample
from various sources and contacting the sample with a compound or an agent capable of
detecting an epitope or nucleic acid (e.g., mRNA, genomic RNA) of the hSARS virus such
that the presence of the hSARS virus is detected in the sample. A preferred agent for
detecting hSARS mRNA or genomic RNA of the invention is a labeled nucleic acid probe
capable of hybridizing to mRNA or genomic RNA encoding a polypeptide of the invention.
The nucleic acid probe can be, for example, a nucleic acid molecule comprising or
consisting of the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 2471, or 2473, or a
portion thereof, such as an oligonucleotide of at least 15, 20, 25, 30, 50, 100, 250, 500, 750,
1,000 or more contiguous nucleotides in length and sufficient to specifically hybridize
under stringent conditions to a hSARS mRNA or genomic RNA.
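
The idea of a probe as a contiguous portion of at least 15 nucleotides can be sketched as a sliding window over a target sequence, with each candidate probe written as the reverse complement of its window so that it would base-pair with the target strand. This is a minimal sketch under stated assumptions: the function names are hypothetical, the example target is an arbitrary made-up string rather than any SEQ ID NO of the specification, and no check of hybridization stringency or specificity is attempted.

```python
# Watson-Crick complements for a DNA-alphabet representation of the target.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

MIN_PROBE_LENGTH = 15  # shortest contiguous length named in the text


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(target, length=MIN_PROBE_LENGTH):
    """List (start, probe) pairs for every contiguous window of `length`.

    Each probe is the reverse complement of the window, i.e. the strand
    that would anneal to that stretch of the target.
    """
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe length below the 15-nucleotide minimum")
    target = target.upper()
    return [
        (start, reverse_complement(target[start:start + length]))
        for start in range(len(target) - length + 1)
    ]


if __name__ == "__main__":
    # Arbitrary placeholder target, NOT a sequence from the specification.
    placeholder_target = "ACGT" * 5
    for start, probe in candidate_probes(placeholder_target):
        print(start, probe)
```

A 20-nucleotide target yields six overlapping 15-mers; in practice each candidate would still have to be screened for specific hybridization under the stringent conditions referred to above.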

In another preferred specific embodiment, the presence of hSARS virus is detected
in the sample by a reverse transcription polymerase chain reaction (RT-PCR) using
primers that are constructed based on a partial nucleotide sequence of the genome of
hSARS virus, for example, that of deposit accession no. CCTCC-V200303, or having a
genomic nucleic acid sequence of SEQ ID NO: 15, or based on a nucleotide sequence of
SEQ ID NO: 1, 11, 13, 2471 or 2473. In a non-limiting specific embodiment, preferred
primers to be used in an RT-PCR method are: 5'-TACACACCTCAGCGTTG-3' (SEQ ID
NO:3) and 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), in the presence of 2.5 mM
MgCl2, and the thermal cycles are, for example, but not limited to, 94 °C for 8 min followed
by 40 cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min (also see Sections 6.7
and 6.8, infra). In preferred embodiments, the primers comprise the nucleic acid sequences of
SEQ ID NOS:2475 and 2476, or SEQ ID NOS:2480 and 2481. In preferred embodiments,
the thermal cycles are 94 °C for 10 min followed by 40 cycles of 94 °C for 30 seconds, 56
°C for 30 seconds, 72 °C for 30 seconds, followed by 72 °C for 10 minutes. In preferred
embodiments, the primers comprise the nucleic acid sequences of SEQ ID NOS:2477 and 2478.
In a more preferred specific embodiment, the present invention provides a real-time
quantitative PCR assay to detect the presence of hSARS virus in a biological sample by
subjecting the cDNA obtained by reverse transcription of the extracted total RNA from the
sample to PCR reactions using the specific primers, such as those having nucleotide
sequences of SEQ ID NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I, which
fluoresces when bound non-specifically to double-stranded DNA. The fluorescence signals
from these reactions are captured at the end of extension steps as PCR product is generated
over a range of the thermal cycles, thereby allowing the quantitative determination of the
viral load in the sample based on an amplification plot (see Section 6.7, infra).
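
One common way to turn an amplification plot into a number is to find the cycle at which fluorescence first crosses a fixed threshold and then read that cycle against a calibration line. The sketch below illustrates that general approach only; it is not asserted to be the quantification procedure of Section 6.7. The function names, the threshold, the fluorescence readings, and the standard-curve slope and intercept are all hypothetical values chosen for illustration.

```python
def threshold_cycle(fluorescence, threshold):
    """Estimate the fractional cycle at which the signal crosses `threshold`.

    `fluorescence[0]` is the end-of-extension reading for cycle 1. The
    crossing point is linearly interpolated between the two flanking
    cycles. Returns None if the signal never rises through the threshold.
    """
    for i in range(1, len(fluorescence)):
        prev, cur = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= cur:
            # prev belongs to cycle i, cur to cycle i + 1 (1-based cycles).
            return i + (threshold - prev) / (cur - prev)
    return None


def copies_from_standard_curve(ct, slope, intercept):
    """Invert a standard curve of the form Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical readings that double each cycle, for illustration only.
    readings = [1.0, 2.0, 4.0, 8.0, 16.0]
    ct = threshold_cycle(readings, threshold=6.0)
    print("threshold cycle:", ct)
    # Hypothetical calibration: slope -3.32 (about one log10 per 3.32 cycles).
    print("estimated copies:", copies_from_standard_curve(30.04, -3.32, 40.0))
```

A lower threshold cycle corresponds to more starting template, which is why the cycle number can stand in for viral load once a calibration line has been measured with known standards.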

A preferred agent for detecting hSARS is an antibody that specifically binds a
polypeptide of the invention or any hSARS epitope, preferably an antibody with a
detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact
antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used.

The term "labeled", with regard to the probe or antibody, is intended to encompass
direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect labeling of the probe or antibody by
reactivity with another reagent that is directly labeled. Examples of indirect labeling
include detection of a primary antibody using a fluorescently labeled secondary antibody
and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently
labeled streptavidin. The detection method of the invention can be used to detect mRNA,
protein (or any epitope), or genomic RNA in a sample in vitro as well as in vivo. For
example, in vitro techniques for detection of mRNA include northern hybridizations, in situ
hybridizations, RT-PCR, and RNase protection. In vitro techniques for detection of an
epitope of hSARS include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations and immunofluorescence. In vitro techniques for detection of
genomic RNA include northern hybridizations, RT-PCR, and RNase protection.
Furthermore, in vivo techniques for detection of hSARS include introducing into a subject
organism a labeled antibody directed against the polypeptide. For example, the antibody
can be labeled with a radioactive marker whose presence and location in the subject
organism can be detected by standard imaging techniques, including autoradiography.

In a specific embodiment, the methods further involve obtaining a control sample
from a control subject, contacting the control sample with a compound or agent capable of
detecting hSARS, e.g., a polypeptide of the invention or mRNA or genomic RNA encoding
a polypeptide of the invention, such that the presence of hSARS or the polypeptide or
mRNA or genomic RNA encoding the polypeptide is detected in the sample, and comparing
the presence of hSARS or the polypeptide or mRNA or genomic RNA encoding the
polypeptide in the control sample with the presence of hSARS, or the polypeptide or mRNA
or genomic RNA encoding the polypeptide in the test sample.

The invention also encompasses kits for detecting the presence of hSARS or a
polypeptide or nucleic acid of the invention in a test sample. The kit, for example, can
comprise a labeled compound or agent capable of detecting hSARS or the polypeptide or a
nucleic acid molecule encoding the polypeptide in a test sample and, in certain
embodiments, a means for determining the amount of the polypeptide or mRNA in the
sample (e.g., an antibody which binds the polypeptide or an oligonucleotide probe which
binds to DNA or mRNA encoding the polypeptide). Kits can also include instructions for
use.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g.,
attached to a solid support) which binds to a polypeptide of the invention or hSARS epitope;
and, optionally, (2) a second, different antibody which binds to either the polypeptide or the
first antibody and is conjugated to a detectable agent.

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence encoding a polypeptide of the invention or to a sequence within the hSARS
genome or (2) a pair of primers useful for amplifying a nucleic acid molecule containing an
hSARS sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or a
protein stabilizing agent. The kit can also comprise components necessary for detecting the
detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample
or a series of control samples which can be assayed and compared to the test sample
contained. Each component of the kit is usually enclosed within an individual container and
all of the various containers are within a single package along with instructions for use.

5.8 Screening Assays to Identify Anti-Viral Agents

The invention provides methods for the identification of a compound that inhibits
the ability of hSARS virus to infect a host or a host cell. In certain embodiments, the
invention provides methods for the identification of a compound that reduces the ability of
hSARS virus to replicate in a host or a host cell. Any technique well-known to the skilled
artisan can be used to screen for a compound that would abolish or reduce the ability of
hSARS virus to infect a host and/or to replicate in a host or a host cell.

In certain embodiments, the invention provides methods for the identification of a
compound that inhibits the ability of hSARS virus to replicate in a mammal or a
mammalian cell. More specifically, the invention provides methods for the identification of
a compound that inhibits the ability of hSARS virus to infect a mammal or a mammalian
cell. In certain embodiments, the invention provides methods for the identification of a
compound that inhibits the ability of hSARS virus to replicate in a mammalian cell. In a
specific embodiment, the mammalian cell is a human cell.

In another embodiment, a cell is contacted with a test compound and infected with
the hSARS virus. In certain embodiments, a control culture is infected with the hSARS
virus in the absence of a test compound. The cell can be contacted with a test compound
before, concurrently with, or subsequent to the infection with the hSARS virus. In a
specific embodiment, the cell is a mammalian cell. In an even more specific embodiment,
the cell is a human cell. In certain embodiments, the cell is incubated with the test
compound for at least 1 minute, at least 5 minutes, at least 15 minutes, at least 30 minutes, at
least 1 hour, at least 2 hours, at least 5 hours, at least 12 hours, or at least 1 day. The titer of
the virus can be measured at any time during the assay. In certain embodiments, a time
course of viral growth in the culture is determined. If the viral growth is inhibited or
reduced in the presence of the test compound, the test compound is identified as being
effective in inhibiting or reducing the growth or infection of the hSARS virus. In a specific
embodiment, the compound that inhibits or reduces the growth of the hSARS virus is tested
for its ability to inhibit or reduce the growth rate of other viruses to test its specificity for the
hSARS virus.
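
The comparison between a treated culture and an untreated control culture can be summarized numerically. The sketch below shows two conventional summaries, log10 titer reduction and percent inhibition; neither metric nor any cutoff is specified in the text, the function names are hypothetical, and the titers in the example are invented for illustration.

```python
import math


def log10_reduction(control_titer, treated_titer):
    """Log10 drop in viral titer in the treated culture relative to control."""
    if control_titer <= 0 or treated_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(control_titer / treated_titer)


def percent_inhibition(control_titer, treated_titer):
    """Percentage by which the test compound lowered the titer."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * (1.0 - treated_titer / control_titer)


if __name__ == "__main__":
    # Invented titers (arbitrary units per ml), not data from the examples.
    control, treated = 1e6, 1e4
    print(f"log10 reduction: {log10_reduction(control, treated):.2f}")
    print(f"inhibition: {percent_inhibition(control, treated):.1f}%")
```

Running the same calculation on titers of an unrelated virus treated with the same compound gives a simple way to express the specificity test described in the last sentence above.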

In one embodiment, a test compound is administered to a model animal and the
model animal is infected with the hSARS virus. In certain embodiments, a control model
animal is infected with the hSARS virus without the administration of a test compound.
The test compound can be administered before, concurrently with, or subsequent to the
infection with the hSARS virus. In a specific embodiment, the model animal is a mammal.
In an even more specific embodiment, the model animal can be, but is not limited to, a
cotton rat, a mouse, or a monkey. The titer of the virus in the model animal can be
measured at any time during the assay. In certain embodiments, a time course of viral
growth in the culture is determined. If the viral growth is inhibited or reduced in the
presence of the test compound, the test compound is identified as being effective in
inhibiting or reducing the growth or infection of the hSARS virus. In a specific
embodiment, the compound that inhibits or reduces the growth of the hSARS in the model
animal is tested for its ability to inhibit or reduce the growth rate of other viruses to test its
specificity for the hSARS virus.

6. EXAMPLES 

The following examples illustrate the isolation and identification of the novel
hSARS virus. These examples should not be construed as limiting.

METHODS AND RESULTS

As a general reference, Wiedbrauk DL & Johnston SLG (Manual of Clinical
Virology, Raven Press, New York, 1993) was used.

6.1 Clinical Subjects 

The study included all 50 patients who fitted a modified World Health Organization
(WHO) definition of SARS and were admitted to 2 acute regional hospitals in Hong Kong
Special Administrative Region (HKSAR) between February 26 and March 26, 2003 (WHO.
Severe acute respiratory syndrome (SARS). Weekly Epidemiol Rec. 2003; 78: 81-83). A
lung biopsy from an additional patient, who had typical SARS and was admitted to a third
hospital, was also included in the study. Briefly, the case definition for SARS was: (i) fever
of 38°C or more; (ii) cough or shortness of breath; (iii) new pulmonary infiltrates on chest
radiograph; and (iv) either a history of exposure to a patient with SARS or absence of
response to empirical antimicrobial coverage for typical and atypical pneumonia (beta-
lactams and macrolides, fluoroquinolones or tetracyclines).
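
The four-part case definition above reads as a conjunction of criteria, two of which contain an internal "or". A minimal sketch of that logic follows; the class and field names are hypothetical, and the sketch assumes all four criteria (i) to (iv) must be met, which is how the sentence is read here.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Clinical findings for one patient (field names are illustrative)."""
    temperature_c: float
    cough: bool
    shortness_of_breath: bool
    new_pulmonary_infiltrates: bool
    exposure_to_sars_patient: bool
    responded_to_empirical_antimicrobials: bool


def meets_case_definition(p):
    """Apply criteria (i) to (iv) of the modified case definition."""
    fever = p.temperature_c >= 38.0                       # (i)
    respiratory = p.cough or p.shortness_of_breath        # (ii)
    infiltrates = p.new_pulmonary_infiltrates             # (iii)
    exposure_or_no_response = (                           # (iv)
        p.exposure_to_sars_patient
        or not p.responded_to_empirical_antimicrobials
    )
    return fever and respiratory and infiltrates and exposure_or_no_response


if __name__ == "__main__":
    example = Presentation(38.5, True, False, True, False, False)
    print(meets_case_definition(example))
```

Writing criterion (iv) as "exposure, or no response to antimicrobials" makes explicit that a patient without a known contact can still qualify if empirical therapy fails.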

Nasopharyngeal aspirates and serum samples were collected from all patients.
Paired acute and convalescent sera and feces were available from some patients. Lung
biopsy tissue from one patient was processed for a viral culture, RT-PCR, routine
histopathological examination, and electron microscopy. Nasopharyngeal aspirates, feces
and sera submitted for microbiological investigation of other diseases were included in the
study under blinding and served as controls.

The medical records were reviewed retrospectively by the attending physicians and
clinical microbiologists. Routine hematological, biochemical and microbiological
examinations, including bacterial culture of blood and sputum, serological study and
collection of nasopharyngeal aspirates for virological tests, were carried out.

6.2 Cell Line

FRhK-4 (fetal rhesus monkey kidney) cells were maintained in minimal essential
medium (MEM) with 1% fetal calf serum, 1% streptomycin and penicillin, 0.2% nystatin
and 0.05% garamycin.

6.3 Viral Infection 

Two hundred µl of clinical (nasopharyngeal aspirate) samples, from two patients
(see the Result section, infra), in virus transport medium were used to infect FRhK-4 cells.
The inoculated cells were incubated at 37°C for 1 hour. One ml of MEM containing 1 µg
trypsin was then added to the culture and the infected cells were incubated in a 37°C
incubator supplied with 5% carbon dioxide. Cytopathic effects were observed in the
infected cells after 2 to 4 days of incubation. The infected cells were passaged into new
FRhK-4 cells and cytopathic effects were observed within 1 day after the inoculation. The
infected cells were tested by an immunofluorescent assay for influenza A, influenza B,
respiratory syncytial virus, parainfluenza types 1, 2 and 3, adenovirus and human
metapneumovirus (hMPV) and negative results were obtained for all cases. The infected
cells were also tested by RT-PCR for influenza A and human metapneumovirus with
negative results.

6.4 Virus Morphology 

The infected cells prepared as described above were harvested, pelleted by
centrifugation and the cell pellets were processed for thin-section transmission electron
microscopic visualization. Viral particles were identified in the cells infected with both
clinical specimens, but not in control cells which were not infected with the virus. Virions
isolated from the infected cells were about 70-100 nanometers (Figure 2). Viral capsids
were found predominantly within the vesicles of the Golgi and endoplasmic reticulum and
were not free in the cytoplasm. Virus particles were also found at the cell membrane.

One virus isolate was ultracentrifuged and the cell pellet was negatively stained
using phosphotungstic acid. Virus particles characteristic of Coronaviridae were thus
visualized. Since the human coronaviruses hitherto recognized are not known to cause a
similar disease, the present inventors postulated that the virus isolates represent a novel
virus that infects humans.

6.5 Antibody Response to the Isolated Virus 

To further confirm that this novel virus is responsible for causing SARS in the
infected patients, blood serum samples from the patients who were suffering from SARS
were obtained and a neutralization test was performed. Typically, diluted serum (x50, x200,
x800 and x1600) was incubated with acetone-fixed FRhK-4 cells infected with hSARS at
37°C for 45 minutes. The incubated cells were then washed with phosphate-buffered saline
and stained with anti-human IgG-FITC conjugated antibody. The cells were then washed
and examined under a fluorescent microscope. In these experiments, positive signals were
found in 8 patients who had SARS (Figure 3), indicating that these patients had an IgG
antibody response to this novel human respiratory virus of Coronaviridae. By contrast, no
signal was detected in 4 negative-control paired sera. The serum titers of anti-hSARS
antibodies of the tested patients are shown in Table 1.

Table 1

Name               Date            Lab No.        Anti-SARS
Patient A          25-Feb-03       S2728          <50
                   6-Mar-03        S2728          1600
Patient B          26-Feb-03       S2441          50
                   3-Mar-03        S2441          200
Patient C          4-Mar-03        S3279          200
                   14-Mar-03       S3279          1600
Patient D          6-Mar-03        M41045         <50
                   11-Mar-03       MB943/03       800
Patient E          4-Mar-03        M38953         <50
                   18-Mar-03       KWH03/3601     800
Control F          13-Feb-03       M27124         <50
                   1-Mar-03        MB942968       <50
Patient G          3-Mar-03        M38685         <50
                   7-Mar-03        KWH03/2900     Equivocal

Blinded samples:
1a*                Acute                          <50
1b                 Convalescent                   1600
2a*                Acute                          50
2b                 Convalescent                   >1600
3a*                Acute                          50
3b                 Convalescent                   >1600
4a*                Acute                          <50
4b                 Convalescent                   <50
5a*                Acute                          <50
5b                 Convalescent                   <50
6a*                Acute                          <50
6b                 Convalescent                   <50

NB: * patients with SARS

These results indicated that this novel member of Coronaviridae is a key pathogen
in SARS.
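
The paired titers in Table 1 can also be compared numerically. The following is a minimal sketch, not part of the disclosed method: it transcribes the named paired sera from Table 1 and computes the rise between the first and second sample. The fourfold cut-off is a common serological convention assumed here for illustration, not a criterion stated in this specification, and the names `PAIRED_TITERS`, `titer_value`, `fold_rise` and `seroconverted` are hypothetical. Patient G is omitted because the second result is reported as "Equivocal".

```python
# Paired first/second serum titers transcribed from Table 1 (named samples only).
PAIRED_TITERS = {
    "Patient A": ("<50", "1600"),
    "Patient B": ("50", "200"),
    "Patient C": ("200", "1600"),
    "Patient D": ("<50", "800"),
    "Patient E": ("<50", "800"),
    "Control F": ("<50", "<50"),
}


def titer_value(titer: str) -> float:
    """Convert a reported titer to a number.

    Assumption: '<50' is capped at 50 and '>1600' at 1600, so a rise computed
    from a '<50' first sample or a '>1600' second sample is a minimum estimate.
    """
    return float(titer.lstrip("<>"))


def fold_rise(first: str, second: str) -> float:
    """Ratio of the second titer to the first."""
    return titer_value(second) / titer_value(first)


def seroconverted(first: str, second: str, min_fold: float = 4.0) -> bool:
    # Assumed convention: a rise of at least fourfold counts as seroconversion.
    return fold_rise(first, second) >= min_fold


if __name__ == "__main__":
    for name, (first, second) in PAIRED_TITERS.items():
        print(name, fold_rise(first, second), seroconverted(first, second))
```

Under these assumptions every named patient pair shows at least a fourfold rise, while the control pair shows none.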



6.6 Sequences of the hSARS Virus

Total RNA from infected or uninfected FRhK-4 cells was harvested two days post-
infection. One hundred ng of purified RNA was reverse transcribed using Superscript II
reverse transcriptase (Invitrogen) in a 20 µl reaction mixture containing 10 pg of a
degenerate primer (5'-GCCGGAGCTCTGCAGAATTCNNNNNNN-3'; N = A, T, G or C;
SEQ ID NO:5) as recommended by the manufacturer. Reverse transcribed products were
then purified by a QIAquick PCR purification kit as instructed by the manufacturer and
eluted in 30 µl of 10 mM Tris-HCl, pH 8.0. Three µl of purified cDNA products were added
to a 25 µl reaction mixture containing 2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5
µl of 10 mM dNTP, 0.25 µl of AmpliTaq Gold® DNA polymerase (Applied Biosystems),
2.5 µCi of [α-32P]CTP (Amersham), and 2 µl of 10 µM primer (5'-
GCCGGAGCTCTGCAGAATTC-3'; SEQ ID NO:6). Reactions were thermal cycled
through the following profile: 94°C for 8 min followed by 2 cycles of 94°C for 1 min,
40°C for 1 min, 72°C for 2 min. This temperature profile was followed by 35 cycles of
94°C for 1 min, 60°C for 1 min, 72°C for 1 min. Six µl of the PCR products were analyzed by
5% denaturing polyacrylamide gel electrophoresis. The gel was exposed to X-ray film and the
film was developed after an overnight exposure. Unique PCR products which were only
identified in infected cell samples were isolated from the gel and eluted in 50 µl of 1x TE
buffer. Eluted PCR products were then re-amplified in 25 µl of reaction mixture containing
2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5 µl of 10 mM dNTP, 0.25 µl of
AmpliTaq Gold® DNA polymerase (Applied Biosystems), and 1 µl of 10 µM primer (5'-
GCCGGAGCTCTGCAGAATTC-3'; SEQ ID NO:6). Reaction mixtures were thermal
cycled through the following profile: 94°C for 8 min followed by 35 cycles of 94°C for 1
min, 60°C for 1 min, 72°C for 1 min. PCR products were cloned using a TOPO TA cloning
kit (Invitrogen) and ligated plasmids were transformed into TOP10 E. coli competent cells
(Invitrogen). PCR inserts were sequenced by a BigDye cycle sequencing kit as
recommended by the manufacturer (Applied Biosystems) and sequencing products were
analyzed by an automatic sequencer (Applied Biosystems, model number 3770). The
obtained sequence (SEQ ID NO:1) is shown in Figure 1. The deduced amino acid
sequence (SEQ ID NO:2) from the obtained DNA sequence showed 57% homology to the
polymerase protein of identified coronaviruses.

Similarly, two other partial sequences (SEQ ID NOS:11 and 13) and deduced amino
acid sequences (SEQ ID NOS:12 and 14, respectively) were obtained from the hSARS virus
and are shown in Figures 8 (SEQ ID NOS:11 and 12) and 9 (SEQ ID NOS:13 and 14).

The entire genomic sequence of hSARS virus is shown in Figure 10 (SEQ ID
NO:15). The deduced amino acid sequences of SEQ ID NO:15 in all three frames (SEQ ID
NOS:16, 240 and 737) are shown in Figure 11 (SEQ ID NOS:17-239, 241-736 and 738-
1107). The deduced amino acid sequences of the complement of SEQ ID NO:15 in all three
frames (SEQ ID NOS:1108, 1590 and 1965) are shown in Figure 12 (SEQ ID NOS:1109-
1589, 1591-1964 and 1966-2470).

6.7 Detection of hSARS Virus in Nasopharyngeal Aspirates 

First, the nasopharyngeal aspirates (NPA) were examined by rapid
immunofluorescent antigen detection for influenza A and B, parainfluenza types 1, 2 and 3,
respiratory syncytial virus and adenovirus (Chan KH, Maldeis N, Pope W, Yup A, Ozinskas
A, Gill J, Seto WH, Shortridge KF, Peiris JSM. Evaluation of Directigen Flu A+B test for
rapid diagnosis of influenza A and B virus infections. J Clin Microbiol 2002; 40: 1675-
1680) and were cultured for conventional respiratory pathogens on Madin-Darby Canine
Kidney, LLC-Mk2, RDE, Hep-2 and MRC-5 cells (Wiedbrauk DL, Johnston SLG. Manual
of clinical virology. Raven Press, New York. 1993). Subsequently, fetal rhesus kidney
(FRhK-4) and A-549 cells were added to the panel of cell lines used. Reverse transcription
polymerase chain reaction (RT-PCR) was performed directly on the clinical specimen for
influenza A (Fouchier RA, Bestebroer TM, Herfst S, Van Der Kemp L, Rimmelzwaan GF,
Osterhaus AD. Detection of influenza A virus from different species by PCR amplification
of conserved sequences in the matrix gene. J Clin Microbiol 2000; 38: 4096-101) and
human metapneumovirus (HMPV). The primers used for HMPV were: for first round, 5'-
AARGTSAATGCATCAGC-3' (SEQ ID NO:7) and 5'-CAKATTYTGCTTATGCTTTC-
3' (SEQ ID NO:8); and nested primers: 5'-ACACCTGTTACAATACCAGC-3' (SEQ ID
NO:9) and 5'-GACTTGAGTCCCAGCTCCA-3' (SEQ ID NO:10). The size of the nested
PCR product was 201 bp. An ELISA for mycoplasma was used to screen cell cultures
(Roche Diagnostics GmbH, Roche, Indianapolis, USA).
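
The first-round HMPV primers and the random-priming oligonucleotide of SEQ ID NO:5 contain degenerate positions (R, S, K, Y, N). As an illustration only, the sketch below counts and enumerates the concrete sequences a degenerate primer stands for, using the standard IUPAC nucleotide codes. The function names `degeneracy` and `expand` are hypothetical, and the primer strings are transcribed from the text above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence covered by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    for name, seq in [
        ("SEQ ID NO:7", "AARGTSAATGCATCAGC"),
        ("SEQ ID NO:8", "CAKATTYTGCTTATGCTTTC"),
        ("SEQ ID NO:5", "GCCGGAGCTCTGCAGAATTCNNNNNNN"),
    ]:
        print(name, degeneracy(seq))
```

Enumeration with `expand` is practical for the HMPV primers; for SEQ ID NO:5 the count alone is usually what matters.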

RT-PCR Assay

Subsequent to culturing and genetic sequencing of the hSARS virus from two
patients (see Section 6.6, supra), an RT-PCR was developed to detect the hSARS virus
sequence from NPA samples. Total RNA from clinical samples was reverse transcribed
using random hexamers and cDNA was amplified using primers 5'-TACACACCTCAGC-
GTTG-3' (SEQ ID NO:3) and 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), which were
constructed based on the RNA-dependent RNA polymerase encoding sequence (SEQ ID
NO:1) of the hSARS virus, in the presence of 2.5 mM MgCl2 (94°C for 8 min followed by 40
cycles of 94°C for 1 min, 50°C for 1 min, 72°C for 1 min).
The summary of a typical RT-PCR protocol is as follows:

1. RNA extraction

RNA from 140 µl of NPA samples is extracted by QIAquick viral RNA extraction
kit and is eluted in 50 µl of elution buffer.

2. Reverse transcription

RNA                                          11.5 µl
0.1 M DTT                                    2 µl
5x buffer                                    4 µl
10 mM dNTP                                   1 µl
Superscript II, 200 U/µl (Invitrogen)        1 µl
Random hexamers, 0.3 µg/µl                   0.5 µl

Reaction condition                           42°C, 50 min
                                             94°C, 3 min
                                             4°C

3. PCR

cDNA generated by random primers is amplified in a 50 µl reaction as follows:

cDNA                                                     2 µl
10 mM dNTP                                               0.5 µl
10x buffer                                               5 µl
25 mM MgCl2                                              5 µl
25 µM Forward primer                                     0.5 µl
25 µM Reverse primer                                     0.5 µl
AmpliTaq Gold polymerase, 5 U/µl (Applied Biosystems)    0.25 µl
Water                                                    36.25 µl

Thermal-cycle condition: 95°C, 10 min, followed by 40 cycles of 95°C, 1 min;
50°C, 1 min; 72°C, 1 min.

4. Primer sequences

Primers were designed based on the RNA-dependent RNA polymerase encoding
sequence (SEQ ID NO:1) of the hSARS virus.

Forward primer: 5' TACACACCTCAGCGTTG 3' (SEQ ID NO:3)
Reverse primer: 5' CACGAACGTGACGAAT 3' (SEQ ID NO:4)

Product size: 182 bps
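
The recipe above can be checked arithmetically. The following is a small sketch under stated assumptions: the volumes are transcribed from the protocol summary, the helper names (`total_volume`, `final_concentration`, `amplicon_length`) are hypothetical, and the nucleotide coordinates used for the product size are the ones given for these two primers in Section 6.8.2 (nt 18041-18057 and nt 18207-18222).

```python
# Volumes in microlitres, transcribed from the protocol summary.
RT_MIX_UL = {
    "RNA": 11.5,
    "0.1 M DTT": 2.0,
    "5x buffer": 4.0,
    "10 mM dNTP": 1.0,
    "Superscript II (200 U/ul)": 1.0,
    "random hexamers (0.3 ug/ul)": 0.5,
}

PCR_MIX_UL = {
    "cDNA": 2.0,
    "10 mM dNTP": 0.5,
    "10x buffer": 5.0,
    "25 mM MgCl2": 5.0,
    "25 uM forward primer": 0.5,
    "25 uM reverse primer": 0.5,
    "AmpliTaq Gold (5 U/ul)": 0.25,
    "water": 36.25,
}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes in microlitres."""
    return sum(mix.values())


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


def amplicon_length(first_nt: int, last_nt: int) -> int:
    """Inclusive length between the 5' ends of the two primers."""
    return last_nt - first_nt + 1


if __name__ == "__main__":
    pcr_total = total_volume(PCR_MIX_UL)
    print("RT reaction volume (ul):", total_volume(RT_MIX_UL))
    print("PCR reaction volume (ul):", pcr_total)
    print("Final MgCl2 (mM):", final_concentration(25.0, 5.0, pcr_total))
    print("Final primer (uM):", final_concentration(25.0, 0.5, pcr_total))
    print("Product size (bp):", amplicon_length(18041, 18222))
```

The totals agree with the stated 50 µl PCR volume, the 2.5 mM MgCl2 condition and the 182 bp product size.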

Real-Time Quantitative PCR Assay

Total RNA from 140 µl of nasopharyngeal aspirate (NPA) was extracted by
QIAamp® virus RNA mini kit (Qiagen) as instructed by the manufacturer. Ten µl of eluted
RNA samples were reverse transcribed by 200 U of Superscript II® reverse transcriptase
(Invitrogen) in a 20 µl reaction mixture containing 0.15 µg of random hexamers, 10 mmol/l
DTT, and 0.5 mmol/l dNTP, as instructed. Complementary DNA was then amplified in
SYBR® Green I fluorescence reaction (Roche) mixtures. Briefly, 20 µl reaction mixtures
containing 2 µl of cDNA, 3.5 mmol/l MgCl2, 0.25 µmol/l of forward primer (5'-
TACACACCTCAGCGTTG-3'; SEQ ID NO:3) and 0.25 µmol/l reverse primer (5'-
CACGAACGTGACGAAT-3'; SEQ ID NO:4) were thermal-cycled by a Light-Cycler
(Roche) with the PCR program [95°C, 10 min followed by 50 cycles of 95°C, 10 min;
57°C, 5 sec; 72°C, 9 sec]. Plasmids containing the target sequence were used as positive
controls. Fluorescence signals from these reactions were captured at the end of the extension
step in each cycle. To determine the specificity of the assay, PCR products (184 base pairs)
were subjected to a melting curve analysis at the end of the assay (65°C to 95°C, 0.1°C per
second).

6.8 Detection of N-gene of hSARS Virus in Patients

6.8.1 RT-PCR diagnosis protocol for coronavirus in SARS patients

Equipment required (for 96 samples): 
1 x SV Total RNA Isolation system
2 x Mega titer plate
3 x 96-well PCR plate
1 x 0.5-10 µl multi-channel pipette
1 x 10-100 µl multi-channel pipette
1 x 20-200 µl multi-channel pipette
1 x vacuum pump
1 x swing-bucket rotor with microtest plate buckets
2 x PCR machine (96-well plate compatible)
1 x Gel electrophoresis apparatus

Station 1* - clinical samples handling (1 medical officer/clinical technician)

• Aliquot 500 µl sample in viral transport medium (containing, per liter, 2 g of sodium
bicarbonate, 5 g of bovine serum albumin, 200 µg of vancomycin, 18 µg of amikacin,
and 160 U of nystatin in Earle's balanced salt solution) from each individual vial
into a well of a 96-well mega titer plate containing 500 µl lysis buffer (1x) containing
100 µl PK-15 cells (ATCC CCL-33; 5.0 x 10^5 cells/ml) in complete minimum essential
medium with Earle's salt (EMEM, Invitrogen) as internal control**

• Mix the lysate by pipetting up-and-down 3 times.

• Proceed to Station 2.

* Station 1 should be carried out inside a Class III biological safety cabinet.

** At least two negative samples should be included in a 96-well platform as a negative
control.

Station 2 - Total RNA extraction (1 laboratory technician) 

• Set up the Vacuum Manifold unit. Place the binding plate onto the Manifold Base.

• Transfer the lysate from the mega titer plate to each well of the SV 96 Binding Plate
(binding plate).

• Apply vacuum until the lysate passes through the binding plate. Release vacuum.

• Add 500 µl of SV RNA Wash Solution (wash solution) to each well of the binding plate.

• Apply vacuum until the wash solution passes through the binding plate. Release vacuum.

• Prepare DNase incubation mix for an entire 96-well plate as below:

Yellow Core Buffer       2 ml
0.09 M MnCl2             250 µl
DNase I                  250 µl

• Apply 25 µl freshly prepared DNase incubation mix directly to the membrane of the
binding plate.

• Incubate at 20-25°C for 10 minutes.

• Add 200 µl of SV DNase Stop Solution to each well of the binding plate.

• Apply vacuum until the SV DNase Stop Solution passes through the binding plate.
Release vacuum.

• Add 500 µl wash solution to each well of the binding plate.

• Apply vacuum until the wash solution passes through the binding plate. Turn off
vacuum.

• Spin the binding plate at 3000 xg for 30 seconds to remove residual wash solution.

• Transfer the binding plate on top of a 96-well RT plate.

• Add 50 µl nuclease-free water into each well of the binding plate to elute RNA.

• Incubate at room temperature for 1 minute.

• Spin the binding plate at 3000 xg, 4°C for 1 minute.

• Collect eluted RNA in the 96-well RT plate.

• Add 5 µl of 3 M sodium acetate and 200 µl of 95% ethanol into each well of the
plate.

• Place the RT plate on ice and incubate for 15 minutes.

• Spin the plate at 3000 xg, 4°C, 15 minutes.

• Discard supernatant by inverting the plate and blotting on a clean paper towel.

• Wash the pellet with 200 µl of 70% ethanol.

• Spin the plate at 3000 xg, 4°C, 10 minutes.

• Discard supernatant by inverting the plate and blotting on a clean paper towel.

• Air-dry the pellet for 5 minutes.

• Add 12 µl of nuclease-free water into each well.

• Vortex the plate briefly to dissolve the pellet (for an example result, see Fig. 18).

• Proceed to Station 3.
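
As a quick check of the DNase step in Station 2, the sketch below compares the volume of incubation mix prepared with the volume a full plate consumes. It is illustrative only; the component volumes and the 25 µl per-well dose are taken from the steps above, and the names `mix_total`, `surplus` and `final_mncl2_mM` are hypothetical.

```python
# DNase incubation mix for one 96-well plate, volumes in microlitres.
DNASE_MIX_UL = {
    "Yellow Core Buffer": 2000,
    "0.09 M MnCl2": 250,
    "DNase I": 250,
}
PER_WELL_UL = 25
WELLS = 96


def mix_total(mix: dict) -> int:
    """Total prepared volume in microlitres."""
    return sum(mix.values())


def surplus(mix: dict, per_well_ul: int, wells: int) -> int:
    """Volume left over after dosing every well once."""
    return mix_total(mix) - per_well_ul * wells


def final_mncl2_mM(mix: dict) -> float:
    """Final MnCl2 concentration in the mix, from a 90 mM (0.09 M) stock."""
    return 90 * mix["0.09 M MnCl2"] / mix_total(mix)


if __name__ == "__main__":
    print("Prepared (ul):", mix_total(DNASE_MIX_UL))
    print("Surplus after 96 wells (ul):", surplus(DNASE_MIX_UL, PER_WELL_UL, WELLS))
    print("Final MnCl2 (mM):", final_mncl2_mM(DNASE_MIX_UL))
```

The prepared volume covers 96 wells with a small excess for pipetting losses.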
Station 3 - Reverse transcription (1 laboratory technician)

• Prepare RT master mix for an entire 96-well plate in a 1.5-ml tube as below (100
reactions):

                                Per Reaction     x100

Random hexamers, 3 µg/µl        0.05 µl          5 µl
dNTPs, 10 mM                    1 µl             100 µl
First-strand buffer, 5x         4 µl             400 µl
DTT, 0.1 M                      2 µl             200 µl
Superscript II, 200 U/µl        1 µl             100 µl

Total                           8.05 µl          805 µl

• Aliquot 100 µl RT mix into 8 wells of a clean 96-well master mix plate.

• From this plate, transfer 8.05 µl RT mix to each well of the RT plate containing 12 µl
RNA, mix by pipetting up-and-down 3 times with a multi-channel pipette.
REPLACE TIP AFTER EACH TRANSFER.

• Incubate the samples at 42°C for 50 minutes followed by 70°C for 15 minutes.

• Proceed to Station 4.
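
The scaling of the RT master mix can be reproduced as follows. This is a minimal sketch, not part of the disclosed protocol: per-reaction volumes are transcribed from the Station 3 table, `Decimal` is used so that 0.05 µl scales exactly, and the names `scale` and `total` are hypothetical.

```python
from decimal import Decimal

# Per-reaction volumes in microlitres, from the Station 3 table.
RT_PER_REACTION_UL = {
    "random hexamers, 3 ug/ul": Decimal("0.05"),
    "dNTPs, 10 mM": Decimal("1"),
    "first-strand buffer, 5x": Decimal("4"),
    "DTT, 0.1 M": Decimal("2"),
    "Superscript II, 200 U/ul": Decimal("1"),
}


def scale(mix: dict, n_reactions: int) -> dict:
    """Multiply every component volume by the number of reactions."""
    return {name: volume * n_reactions for name, volume in mix.items()}


def total(mix: dict) -> Decimal:
    """Total volume of a mix in microlitres."""
    return sum(mix.values(), Decimal("0"))


if __name__ == "__main__":
    batch = scale(RT_PER_REACTION_UL, 100)
    for name, volume in batch.items():
        print(f"{name}: {volume} ul")
    print("Per reaction (ul):", total(RT_PER_REACTION_UL))
    print("Batch total (ul):", total(batch))
    # Each of the 8 aliquot wells (100 ul) feeds one row of 12 RT wells.
    print("Drawn per aliquot well (ul):", 12 * total(RT_PER_REACTION_UL))
    # Hexamer mass per reaction: 0.05 ul of a 3 ug/ul stock.
    print("Hexamers per reaction (ug):", RT_PER_REACTION_UL["random hexamers, 3 ug/ul"] * 3)
```

Each 100 µl aliquot well supplies one row of 12 transfers, and the hexamer mass per reaction matches the 0.15 µg used in the first-strand synthesis of Section 6.8.2.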


Station 4 - N-gene specific PCR (1 laboratory technician)

• Prepare PCR master mix for an entire 96-well plate in two 2059 culture tubes as
below (100 reactions):

                          N-specific PCR                 Control PCR
                          Per 25 µl Reaction   x100      Per 25 µl Reaction   x100

mQH2O                     18.65 µl             1865 µl   17.65 µl             1765 µl
10x PCR buffer            2.5 µl               250 µl    2.5 µl               250 µl
25 mM MgCl2               1.5 µl               150 µl    2.5 µl               250 µl
10 mM dNTPs               0.25 µl              25 µl     0.25 µl              25 µl
Forward primer 10 µM      0.5 µl               50 µl     0.5 µl               50 µl
Reverse primer 10 µM      0.5 µl               50 µl     0.5 µl               50 µl
AmpliTaq Gold® 500 U      0.1 µl               10 µl     0.1 µl               10 µl
Template DNA              1 µl                           1 µl

Total                     25 µl                2400 µl   25 µl                2400 µl

• N-gene specific PCR and control PCR are performed in two individual PCR plates.

• Aliquot 290 µl PCR master mix into the first column of a 96-well PCR plate.

• From the first column, aliquot 24 µl of master mix into each well of the PCR plate.

• Transfer 1 µl of cDNA template (from Station 3) into each well of the PCR plate.

• Mix by pipetting up-and-down 3 times with a multi-channel pipette. REPLACE
TIP AFTER EACH TRANSFER.

• Seal the plate with sealing tape.

• Perform the following reaction in two 96-well PCR machines:

N-gene specific PCR
94°C 10 minutes
94°C 30 seconds  }
56°C 30 seconds  } 40 cycles
72°C 30 seconds  }
72°C 10 minutes

Control PCR
94°C 10 minutes
94°C 30 seconds  }
55°C 30 seconds  } 35 cycles
72°C 45 seconds  }
72°C 10 minutes
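
The two master mixes and cycling programs of Station 4 can be compared side by side. The sketch below is illustrative only: volumes and step times are transcribed from the Station 4 table and programs, the template volume of 1 µl follows from the transfer step, instrument ramp times are ignored, and all function and variable names are hypothetical.

```python
# Per-reaction master-mix volumes in microlitres (template added separately).
MASTER_MIX_UL = {
    "N-specific": {
        "water": 18.65, "10x PCR buffer": 2.5, "25 mM MgCl2": 1.5,
        "10 mM dNTPs": 0.25, "forward primer": 0.5, "reverse primer": 0.5,
        "AmpliTaq Gold": 0.1,
    },
    "control": {
        "water": 17.65, "10x PCR buffer": 2.5, "25 mM MgCl2": 2.5,
        "10 mM dNTPs": 0.25, "forward primer": 0.5, "reverse primer": 0.5,
        "AmpliTaq Gold": 0.1,
    },
}
TEMPLATE_UL = 1.0

# Hold times in seconds: initial denaturation, per-cycle steps, final extension.
PROGRAMS = {
    "N-specific": {"initial": 600, "cycles": 40, "steps": (30, 30, 30), "final": 600},
    "control": {"initial": 600, "cycles": 35, "steps": (30, 30, 45), "final": 600},
}


def reaction_volume(mix: dict) -> float:
    """Master mix plus template, in microlitres."""
    return sum(mix.values()) + TEMPLATE_UL


def final_mgcl2_mM(mix: dict) -> float:
    """Final MgCl2 concentration from the 25 mM stock."""
    return 25.0 * mix["25 mM MgCl2"] / reaction_volume(mix)


def hold_time_min(program: dict) -> float:
    """Total programmed hold time in minutes, excluding ramping."""
    cycling = program["cycles"] * sum(program["steps"])
    return (program["initial"] + cycling + program["final"]) / 60


if __name__ == "__main__":
    for name in MASTER_MIX_UL:
        mix = MASTER_MIX_UL[name]
        print(name, round(reaction_volume(mix), 2), round(final_mgcl2_mM(mix), 2),
              hold_time_min(PROGRAMS[name]))
```

Both programs have similar programmed run times, which is consistent with running the two plates in parallel on two machines.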



Station 5 - Gel electrophoresis (1 laboratory technician)

• Mix 5 µl of N-gene specific PCR product and 5 µl control PCR product with 1 µl
bromophenol blue loading dye.

• Load the samples into the wells of a 2% agarose gel.

• Electrophorese the PCR products at 140 V, 250 mA for 30 minutes.

• Stain the gel with ethidium bromide.

• Visualize the products with UV and record the result.



6.8.2 Using primers of SEQ ID NOS:2480 and 2481

RT-PCR diagnostic protocol was performed as described in Section 6.8.1 with some
modifications.

RNA isolation from clinical samples

Clinical samples including nasopharyngeal aspirates (NPA) and stool specimens
were provided by the Department of Microbiology, The University of Hong Kong. In
addition, tracheal dispersion and lung biopsy from an index patient A described in New
Engl. J. Med. 348:1967-76 (by Drosten C.S., et al., 2003) at three time points were also
collected. Sample collection was conducted from April 1 to April 28, 2003 in local
hospitals. The method of sample collection was described in the previous section (also see
Poon et al., 2003, Clinical Chemistry 49:953-955). Total RNA extraction from clinical
samples was carried out with the SV96 Total RNA Isolation System (Promega, WI, USA), with
the following modifications from the manufacturer's protocol. Five hundred (500) µl of
NPA/stool sample in viral transport medium (containing, per liter, 2 g of sodium
bicarbonate, 5 g of bovine serum albumin, 200 µg of vancomycin, 18 µg of amikacin, and
160 U of nystatin in Earle's balanced salt solution) was mixed with an equal volume of SV
RNA Lysis Buffer containing 100 µl of pig kidney epithelial (PK-15) cells (ATCC CCL-33;
5.0 x 10^5 cells/ml) in complete minimum essential medium with Earle's salt (EMEM,
Invitrogen) as internal control. The mixture was transferred to the wells of the SV 96
Binding Plate. After washing with 500 µl of SV RNA Wash Solution prior to the elution step,
the plate was spun at 3000 xg for 30 seconds to remove residual wash solution. RNA was
then eluted with 50 µl of nuclease-free water, and was collected in a clean 96-well PCR
plate by spinning the plate at 3000 xg for 1 minute. Eluted RNA was then concentrated by
incubating on ice for 15 minutes in the presence of 5 µl of 3 M sodium acetate and 200 µl
of 95% ethanol. After centrifugation at 3000 xg, 4°C for 15 minutes, the RNA pellet was
washed with 200 µl of 75% ethanol and dissolved in 12 µl of nuclease-free water.
Extracted RNA was immediately reverse-transcribed to first-strand cDNA.

First-Strand cDNA Synthesis

Reverse transcription was performed with 200 U of Superscript® II reverse
transcriptase (Invitrogen, USA) in a 20 µl reaction containing 0.15 µg of random hexamers,
RT buffer (1x), 10 mM dithiothreitol (DTT) and 0.5 mM deoxynucleotide triphosphates
(dNTPs). The reaction was carried out in a Peltier Thermal Cycler (MJ Research) with the
following conditions: 50 minutes at 42°C followed by 15 minutes at 70°C.
Polymerase Chain Reaction (PCR)

Primers were designed according to the complete SARS-CoV genomic sequence of a
local specimen HK-39 announced previously (accession no. AY278491). Forward primer
(SRS251: 5'-GCAGTCAAGCCTCTTCTCG-3'; SEQ ID NO:2480, corresponding to nt
28658-28676 of HK-39 SARS genome, i.e., CCTCC200303) and reverse primer (SRS252:
5'-GCCTCAGCAGCAGATTTC-3'; SEQ ID NO:2481, corresponding to nt 28866-28883
of HK-39 SARS genome) amplified a 225 bp fragment from the region of the N-gene that
showed no homology to other coronaviruses. Primers amplifying RNA-dependent RNA
polymerase (1b gene) were used as a parallel control (coro3: 5'-TACACACCTCAGCGTTG-
3' (SEQ ID NO:3), corresponding to nt 18041-18057; and coro4: 5'-
CACGAACGTGACGAAT-3' (SEQ ID NO:4), corresponding to nt 18207-18222,
Department of Microbiology, the University of Hong Kong). Both amplicons were cloned
into the same pCR2.1 cloning vector (Fig. 17). Serially diluted plasmid was then used to
determine the dynamic range and optimal condition of the PCRs (Fig. 21A and 21B).
Another set of primers amplifying a 745 bp fragment from the pig β-actin gene was
employed as an internal control for the diagnostic PCR assay (actin-F: 5'-
TGAGACCTTCAACACGCC-3' (SEQ ID NO:2482); and actin-R: 5'-
ATCTGCTGGAAGGTGGAC-3' (SEQ ID NO:2483)).
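
Because the primer sequences above are recovered from a scanned text, a simple consistency check is to compare each primer's length with the genome span quoted for it. The sketch below does this for the four SARS-CoV primers; it is an illustration under the assumption that the quoted coordinates are inclusive, and the helper names (`span_length`, `gc_percent`, `reverse_complement`) are hypothetical.

```python
# (sequence 5'->3', first nt, last nt) as quoted for each primer in the text.
PRIMERS = {
    "SRS251": ("GCAGTCAAGCCTCTTCTCG", 28658, 28676),
    "SRS252": ("GCCTCAGCAGCAGATTTC", 28866, 28883),
    "coro3": ("TACACACCTCAGCGTTG", 18041, 18057),
    "coro4": ("CACGAACGTGACGAAT", 18207, 18222),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def span_length(first_nt: int, last_nt: int) -> int:
    """Inclusive length of a coordinate range."""
    return last_nt - first_nt + 1


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, (seq, first, last) in PRIMERS.items():
        matches = len(seq) == span_length(first, last)
        print(name, len(seq), span_length(first, last), matches, round(gc_percent(seq), 1))
    # A reverse primer anneals to the plus strand; the genome reads as its reverse complement.
    print(reverse_complement(PRIMERS["coro4"][0]))
```

All four primer lengths agree with their quoted coordinate spans, which supports the transcription of the sequences.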

Conventional PCR and gel electrophoresis were carried out as a preliminary experiment.
Briefly, 1 µl of cDNA from clinical samples was amplified with 0.5 U Taq DNA
polymerase recombinant (Invitrogen Life Technologies) in a 25-µl reaction containing PCR
buffer (1x), 1.5 mM MgCl2, 0.1 mM dNTPs and 0.5 pmol of each forward and reverse
primer. The reaction was performed in a Peltier Thermal Cycler (MJ Research) with the
following conditions: 3 minutes at 94°C, followed by 50 cycles of 94°C for 10 seconds,
56°C for 10 seconds, 72°C for 10 seconds, and a 10-minute final extension step at 72°C.
Amplicons were analyzed by 2% agarose gel electrophoresis (Fig. 23). Quantitative real-
time PCR using SYBR® Green fluorophore was performed in diagnosis of clinical
samples. In a 25 µl reaction, 1 µl cDNA template was mixed with 12.5 µl (2x) Green PCR
Master Mix (Applied Biosystems) and 0.5 pmol of each forward and reverse primer.
The volume of the reaction was adjusted to 25 µl with distilled water. Reactions were
performed in the iCycler iQ Real-Time PCR Detection System (Bio-Rad) under the same
conditions as the conventional PCR. Fluorescence signals (FAM, excitation = 490 nm,
emission = 530 nm) were collected at the end of each extension step during the PCR cycles
(Fig. 22A). The threshold cycle (Ct) of each sample was determined using the maximum curvature
approach. Melting curve analysis was performed after the 10-minute final extension (Fig.
22B). cDNA from non-SARS patients, including patients suffering from adenovirus (n = 5),
respiratory syncytial virus (n = 5), human metapneumovirus (n = 5), influenza A virus (n =
5), or influenza B virus (n = 5) infection, was used as negative controls for the assay.
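
The text does not describe how the instrument implements the maximum curvature approach. As one plausible reading, offered only as a sketch, the threshold cycle can be taken as the cycle at which the second difference of the fluorescence curve peaks, i.e. where the curve bends upward most sharply. The simulated logistic curve, its parameters, and the names `simulated_curve` and `ct_max_second_difference` are all assumptions made for illustration and do not reproduce the iCycler software.

```python
import math


def simulated_curve(cycles=50, plateau=1000.0, midpoint=30.0, slope=0.6):
    """Hypothetical sigmoidal amplification curve, one value per cycle (1-based)."""
    return [plateau / (1.0 + math.exp(-slope * (c - midpoint)))
            for c in range(1, cycles + 1)]


def ct_max_second_difference(signal):
    """Return the 1-based cycle where the discrete second difference is largest."""
    best_cycle, best_value = None, float("-inf")
    for i in range(1, len(signal) - 1):
        second_diff = signal[i + 1] - 2.0 * signal[i] + signal[i - 1]
        if second_diff > best_value:
            best_cycle, best_value = i + 1, second_diff
    return best_cycle


if __name__ == "__main__":
    early = simulated_curve(midpoint=30.0)
    late = simulated_curve(midpoint=33.0)
    print("Ct (earlier curve):", ct_max_second_difference(early))
    print("Ct (later curve):", ct_max_second_difference(late))
```

With this criterion a curve that rises earlier yields a lower Ct, which is the property the comparisons between the N-gene and 1b-gene assays rely on.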

Northern Blot Analysis 

SARS-CoV HK-39 strain infected Vero cells were provided by the Department of
Microbiology, the University of Hong Kong. Total RNA was extracted from the cells with
TRIzol® reagent (Invitrogen Life Technologies) according to the manufacturer's protocol.
Eight (8) µg of total RNA was separated by electrophoresis on a 1% agarose gel containing
3.7% formaldehyde. RNA was transferred to a positively charged nylon membrane (Roche
Diagnostic Corporation) by capillary blotting and fixed by UV cross-linking. cDNA
synthesized with the same RNA sample was used as template for probe synthesis. Four
pairs of primers amplifying fragments from the 1b (nt 18057 - 18222; SEQ ID NO:2484), S (nt
21920 - 22107; SEQ ID NO:2485), M (nt 26867 - 26996; SEQ ID NO:2486) and N (nt
28658 - 28883; SEQ ID NO:2487) genes were used in probe synthesis. DIG-labeling of
probes, hybridization and detection of bands were performed with the digoxigenin system
according to the manufacturer's procedures (Roche Molecular Biochemicals). Signals were
then analysed by chemiluminescence (Fig. 24).

Results and Discussion 

A large-scale RT-PCR assay provides a rapid means of monitoring and screening
SARS suspects. The result can be used to complement clinical diagnostic evaluation. To
serve a diagnostic purpose, the assay should be reliable and its accuracy should
be assured so as to prevent both false negative and false positive results.
However, accuracy of the test may be influenced by several factors. A common technical
problem with PCR is a failure of amplification due to the presence of PCR inhibitors (see
Fig. 21).

These PCR inhibitors include heme compounds found in blood, aqueous and
vitreous humors, heparin, EDTA, urine and polyamines (Fredricks et al., 1998, J. Clin.
Micro. 36:2810-16). Currently, NPA or stool samples are collected into transport medium
to maintain the viability of the viral particles. In a preliminary experiment, RT-PCR was
inhibited when the extracted total RNA was used directly for first-strand cDNA synthesis
without any treatment (25 out of 27 samples). However, after a simple ethanol precipitation step,
the amplification of DNA could be restored (Fig. 19). The same result was obtained using either
the SV or the SV96 Total RNA Isolation System (data not shown). This demonstrated that some
components in either the medium or the NPA/stool samples affect the downstream
processes of the diagnostic test.

In addition, the current sample collection procedure dilutes the virus titer in the samples,
especially during the early stage of infection, in which the virus titer is low in nasal and throat
swab specimens (Drosten et al., 2003, New England Journal of Medicine, on-line at
http://content.nejm.org/cgi/reprint/NEJMoa030747v2). It was suggested that the sensitivity
of PCR tests for SARS depended on the quality of the specimen and the time of testing
during the course of the illness. In order to increase the sensitivity of the test, total RNA
isolated from clinical samples was concentrated prior to first-strand cDNA synthesis.

In order to avoid false negative PCR results due to failure in the process of RNA
isolation and first-strand cDNA synthesis, total RNA was extracted from clinical samples in
parallel with PK-15 mammalian cells. Figure 23 shows the RT-PCR screening result for
48 clinical samples, including both NPA and stool samples. Diagnostic PCR was
performed in parallel with β-actin PCR. All samples were positive in the β-actin PCR. The
result indicated that RNA and cDNA could be extracted and synthesized successfully from
the samples in a single-step protocol as disclosed herein. With this internal control, total
RNA isolation and cDNA synthesis from the samples were ensured, which eliminated false
negatives resulting from failure in either of the above processes. Moreover, the 96-well
assay format currently developed can be adopted into a high-throughput screening protocol,
with which we are able to obtain diagnostic results for more than 90 clinical samples in 3
hours with 1 clinical staff member, while the existing protocol, in which samples are
processed in individual tubes, can only handle about 30-50 samples a day per technician.
A real-time quantitative PCR assay is more sensitive than a conventional agarose gel
electrophoresis-associated PCR assay (Poon et al., J. Clin. Virol. 28:233-8) and was therefore
employed for SARS-CoV diagnosis. Positive signals were detected in 38 of 136
randomly selected clinical samples in both N-gene and 1b-gene specific PCR. Among these
38 positives, 3 were stool samples (2.2%) and 35 were NPA samples (25.7%). The detection
rate of the assay employing N-gene specific RT-PCR at different time points is shown in
Table 2.

Table 2

Date of onset        No. of sample    No. of positive    Detection rate (%)

1-2                  15               2                  13.3
3-4                  17               4                  23.5
5-6                  15               4                  26.7
7-8                  13               5                  38.5
9-10                 9                4                  44.4
Negative control     19               All negative
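
The detection rates in Table 2 follow directly from the sample and positive counts. The short sketch below recomputes them; the counts are transcribed from Table 2 and from the 136-sample screen described in the text, and the names `TABLE_2` and `detection_rate` are hypothetical.

```python
# (date-of-onset group, number of samples, number of positives) from Table 2.
TABLE_2 = [
    ("1-2", 15, 2),
    ("3-4", 17, 4),
    ("5-6", 15, 4),
    ("7-8", 13, 5),
    ("9-10", 9, 4),
]


def detection_rate(n_positive: int, n_samples: int) -> float:
    """Percentage of positive samples, rounded to one decimal place."""
    return round(100.0 * n_positive / n_samples, 1)


if __name__ == "__main__":
    for group, n_samples, n_positive in TABLE_2:
        print(group, detection_rate(n_positive, n_samples))
    # Shares quoted for the 136 randomly selected samples.
    print("stool:", detection_rate(3, 136), "NPA:", detection_rate(35, 136))
```

The recomputed values match the table and show the detection rate rising across the onset groups.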



These 38 positive cases were confirmed by melting curve analysis of the
PCR products. The specific melting temperatures of the N gene and 1b gene PCR products (85.5°C
and 80.5°C, respectively) indicated that the target fragments were amplified in the reaction.
Specificity of the assay was also validated with samples from non-SARS patients, including
patients suffering from adenovirus (n = 5), respiratory syncytial virus (n = 5), human
metapneumovirus (n = 5), influenza A virus (n = 5) and influenza B virus (n = 5). The
result shows that all of these samples were negative in the assay (Fig. ??). These results
indicate that the N-gene specific RT-PCR assay is specific for SARS-CoV diagnosis.

Furthermore, we also demonstrated that the N-gene specific PCR was more sensitive
than the PCR amplifying the 1b RNA polymerase gene. Amplification conditions for both
PCR assays were optimized (see Fig. 22) first with the plasmid construct containing a 1:1
ratio of 1b- and N-gene fragments (see Fig. 20). The dynamic range of N-gene specific PCR was
obtained (Fig. ??) and it was found to have lower Ct values than that of 1b-specific PCR.
This revealed that N-gene specific PCR could achieve higher amplification efficiency than
1b-gene specific PCR when using the same copy number of template. PCR with cDNA from
clinical samples or virus infected Vero cells was then performed. Figure 22A shows the Ct
and half-maximal values of the fluorescent signal of N gene and 1b gene specific PCR
generated from NPA, tracheal dispersion and lung biopsy from patient A. The results
indicated that fluorescent signals given in N gene specific PCR are higher (26.0% on
average, ranging from 6.3 to 60%) than those of 1b specific PCR in all positive samples.
Furthermore, Ct values of N gene specific PCR are lower (0.1-4.6 cycles) than those of 1b
specific PCR among most of the SARS-CoV positive samples (Table 3).

Table 3

ΔCt = 1.49 ± 0.47, 95% confidence interval = 0.74 to 2.23 (F-test)

Statistical analysis indicates that the Ct of the N-gene PCR assay is significantly lower than
that of the 1b-gene assay (95% confidence interval = 0.74 to 2.23, F-test). Stronger
fluorescence signals and lower Ct values of N gene specific PCR provide a more sensitive
diagnostic result and a better target for the assay.
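
A Ct difference can be translated into an approximate difference in detectable template. The sketch below is a rough illustration rather than a result of this study: it assumes, by default, that product doubles every cycle in both assays (efficiency = 1.0), which the specification does not claim, so the output should be read as an order-of-magnitude guide only. The function name `fold_difference` is hypothetical.

```python
def fold_difference(delta_ct: float, efficiency: float = 1.0) -> float:
    """Apparent template ratio implied by a Ct difference.

    Assumption: both assays amplify with the same per-cycle efficiency,
    where 1.0 means perfect doubling each cycle.
    """
    return (1.0 + efficiency) ** delta_ct


if __name__ == "__main__":
    mean_dct, ci_low, ci_high = 1.49, 0.74, 2.23  # values reported with Table 3
    print("mean:", round(fold_difference(mean_dct), 2))
    print("95% CI:", round(fold_difference(ci_low), 2), "to", round(fold_difference(ci_high), 2))
```

Under the doubling assumption, the reported mean ΔCt corresponds to somewhat less than a threefold difference in apparent template.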

Using cDNA from SARS-CoV infected Vero cells, the amplification curves shown in
Fig. 21B show the differences between N gene and 1b gene specific PCR. The Ct of the N gene
and 1b gene specific PCR was 35.3 and 37.8, respectively. This phenomenon had two
main causes in SARS-CoV infected cells: (1) the expression level of the N gene was higher
than that of the 1b gene; and (2) the copy number of the N gene was much larger than that of
the 1b gene because each transcript contained a copy of the N gene. Northern blot analysis
supported this hypothesis (Fig. 24). When the N-gene specific PCR product was used as a probe,
at least five transcripts from the virus were hybridized and gave positive signals (Figure 24).
This result agreed with the findings in which five subgenomic mRNAs were detected by Northern
hybridization of RNA from SARS-CoV infected cells using a probe derived from the 3'
untranslated region (Rota et al., 2003, Science 300:1394-99). On the other hand, when the 1b
PCR product was used as a probe, only 2 transcripts of high molecular size were
hybridized, demonstrating that the copy number of the N gene was much higher than that of the 1b
gene during transcription and gene expression in the host cells. The Northern hybridization
result strongly supports the conclusion that PCR amplifying regions in the N gene of the SARS-
CoV is more sensitive than PCR amplifying other regions as a target for diagnostic screening.
It is possible that amplification of more than one genome region may increase the specificity
of the test (Yam W.C., et al., 2003, J. Clin. Microbiol. 41:4521-24).

In conclusion, we have developed a new generation of RT-PCR diagnostic test which
is more sensitive than the conventional diagnostic test for the detection of the coronavirus
associated with SARS. The assay provides a high-throughput, highly sensitive screening
platform, which enables us to scale up to test hundreds of thousands of suspected SARS
cases each day in a single working line. Incorporation of PK-15 cells as an internal control
in the assay and use of the N gene as a diagnostic locus in addition to the 1b gene can enhance the
sensitivity and accuracy of the test. We are adapting the protocol to a 96-well real-time
quantitative PCR and sequencing format to shorten the time required for the test and to
obtain information on genotypic variation of the virus.

CLINICAL RESULTS

Clinical findings:

All 50 patients with SARS were ethnic Chinese. They represented 5 different
epidemiologically linked clusters as well as additional sporadic cases fitting the case
definition. They were hospitalized at a mean of 5 days after the onset of symptoms. The
median age was 42 years (range of 23 to 74) and the female to male ratio was 1.3. Fourteen
(28%) were health care workers and five (10%) had a history of a visit to a hospital
experiencing a major outbreak of SARS. Thirteen (26%) patients had household contacts
and 12 (24%) others had social contacts with patients with SARS. Four (8%) had a history
of recent travel to mainland China.

The major complaints from most patients were fever (90%) and shortness of breath.
Cough and myalgia were present in more than half the patients (Table 4). Upper respiratory
tract symptoms such as rhinorrhea (24%) and sore throat (20%) were present in a minority
of patients. Diarrhea (10%) and anorexia (10%) were also reported. At initial examination,
auscultatory findings, such as crepitations and decreased air entry, were present in only 38%
of patients. Dry cough was reported by 62% of patients. All patients had radiological
evidence of consolidation at the time of admission, involving 1 zone (in 36), 2 zones (13)
and 3 zones (1).
Table 4

Clinical symptoms        Number (percentage)

Fever                    50 (100%)
Chill or rigors          37 (74%)
Cough                    31 (62%)
Myalgia                  27 (54%)
Malaise                  25 (50%)
Running nose             12 (24%)
Sore throat              10 (20%)
Shortness of breath      10 (20%)
Anorexia                 10 (20%)
Diarrhea                 5 (10%)
Headache                 10 (20%)
Dizziness                6 (12%)

* Truncal maculopapular rash was noted in 1 patient.



In spite of the high fever, most patients (98%) had no evidence of a leukocytosis.
Lymphopenia (68%), leucopenia (26%), thrombocytopenia (40%) and anemia (18%) were
present in peripheral blood examination (Table 5). The parenchymal liver enzyme alanine
aminotransferase (ALT) and the muscle enzyme creatine kinase (CPK) were elevated in 34%
and 26% of patients, respectively.

Table 5

Laboratory parameter                      Mean (range)      Percentage of abnormal    Normal range

Haemoglobin                               12.9 (8.9-15.9)                             11.5-16.5 g/dl
  Anaemia                                                   9 (18%)
White cell count                          5.17 (1.1-11.4)                             4-11 x 10^9/L
  Leucopenia                                                13 (26%)
Lymphocyte count                          0.78 (0.3-1.5)                              1.5-4.0 x 10^9/L
  Significant lymphopenia (<1.0 x 10^9/L)                   34 (68%)
Platelet count                            174 (88-351)                                150-400 x 10^9/L
  Thrombocytopenia                                          20 (40%)
Alanine aminotransferase (ALT)            63 (11-350)                                 6-53 U/L
  Elevated ALT                                              17 (34%)
Albumin                                   37 (26-50)                                  42-54 g/L
  Low albumin                                               34 (68%)
Globulin                                  33 (21-42)                                  24-36 g/L
  Elevated globulin                                         10 (20%)
Creatine kinase                           244 (31-1379)                               34-138 U/L
  Elevated creatine kinase                                  13 (26%)



Routine microbiological investigations for known viruses and bacteria by culture,
antigen detection, and PCR were negative in most cases. Blood culture was positive for
Escherichia coli in a 74-year-old male patient, who was admitted to the intensive care unit,
and was attributed to hospital acquired urinary tract infection. Klebsiella pneumoniae and
Haemophilus influenzae were isolated from the sputum specimens of 2 other patients on
admission.

Oral levofloxacin 500 mg q24h was given to 9 patients, and intravenous (1.2 g q8h)/
oral (375 mg tid) amoxicillin-clavulanate and intravenous/oral clarithromycin 500 mg q12h
were given to another 40 patients. Four patients were given oral oseltamivir 75 mg bid. In
one patient, intravenous ceftriaxone 2 g q24h, oral azithromycin 500 mg q24h, and oral
amantadine 100 mg bid were given for empirical coverage of typical and atypical
pneumonia.

Nineteen patients progressed to severe disease with oxygen desaturation and
required intensive care and ventilatory support. The mean number of days from the onset
of symptoms to deterioration was 8.3 days. Intravenous ribavirin 8 mg/kg q8h and steroid
were given to 49 patients at a mean of 6.7 days after onset of symptoms.

The risk factors associated with severe complicated disease requiring intensive care
and ventilatory support were older age, lymphopenia, impaired ALT, and delayed initiation
of ribavirin and steroid (Table 6). All the complicated cases were treated with ribavirin and
steroid after admission to the intensive care unit whereas all the uncomplicated cases were
started on ribavirin and steroid in the general ward. As expected, 31 uncomplicated cases
recovered or improved whereas 8 complicated cases deteriorated with one death at the time
of writing. All 50 patients were monitored for a mean of 12 days at the time of writing.

Table 6

                                            Complicated       Uncomplicated
                                            case (n=19)       case (n=31)       P value*

Mean (SD) age (range)                       49.5 ± 12.7       39.0 ± 10.7       P < 0.01
Male / Female ratio                         8/11              14/17             N.S.
Underlying illness                          5†                1‡                P < 0.05
Mode of contact
  Travel to China                           1                 3                 N.S.
  Health care worker                        5                 9                 N.S.
  Hospital visit                            1                 4                 N.S.
  Household contact                         8                 5                 P < 0.05
  Social contact                            4                 10                N.S.
Mean (SD) duration of symptoms to           5.2 ± 2.0         4.7 ± 2.5         N.S.
  admission (days)
Mean (SD) admission temperature (°C)        38.8 ± 0.9        38.7 ± 0.8        N.S.
Mean (SD) initial total peripheral WBC      5.1 ± 2.4         5.2 ± 1.8         N.S.
  count (x 10^9/L)
Mean (SD) initial lymphocyte count          0.66 ± 0.3        0.85 ± 0.3        P < 0.05
  (x 10^9/L)
Presence of thrombocytopenia                8                 12                N.S.
  (<150 x 10^9/L)
Impaired liver function test                11                6                 P < 0.01
CXR changes (number of zones affected)      1.4               1.2               N.S.
Mean (SD) day of deterioration from the     8.3 ± 2.6         Not applicable
  onset of symptoms §
Mean (SD) day of initiation of ribavirin    7.7 ± 2.9         5.7 ± 2.6         P < 0.05
  & steroid from the onset of symptoms
Initiation of ribavirin & steroid after     12                0                 P < 0.001
  deterioration
Response to ribavirin & steroid             11                28                P < 0.05
Outcome
  Improved or recovered                     10                31                P < 0.01
  Not improving ||                          8                 0                 P < 0.01

* Multi-variant analysis is not performed due to low number of cases;
† 2 patients had diabetes mellitus, 1 had hypertrophic obstructive cardiomyopathy, 1
had chronic active hepatitis B, and 1 had brain tumour;
‡ 1 patient had essential hypertension;
§ desaturation requiring intensive care support;
|| 1 died.
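
As an illustration of how a P value such as the one in the household contact row of Table 6 might be obtained, the following minimal sketch applies a Pearson chi-square test without continuity correction to the 2 x 2 counts (8 of 19 complicated cases versus 5 of 31 uncomplicated cases). The specification does not state which statistical test was used, so the choice of test and the function name chi_square_2x2 are assumptions made for illustration only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its upper-tail P value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Household contact (Table 6): 8 of 19 complicated vs 5 of 31 uncomplicated cases.
chi2, p = chi_square_2x2(8, 19 - 8, 5, 31 - 5)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the statistic is about 4.13 on one degree of freedom, which is consistent with the reported P < 0.05. A different test (for example, Fisher's exact test) could give a different value.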

Two virus isolates, subsequently identified as members of Coronaviridae (see
below), were isolated from two patients. One was from open lung biopsy tissue of a
53-year-old Hong Kong Chinese resident and the other from a nasopharyngeal aspirate of a
42-year-old female with good previous health. The 53-year-old male had a history of 10-hour
household contact with a Chinese visitor who came from Guangzhou and later died from
SARS. Two days after this exposure, he presented with fever, malaise, myalgia, and
headache. Crepitations were present over the right lower zone and there was a
corresponding alveolar shadow on the chest radiograph. Hematological investigation
revealed lymphopenia of 0.7 x 10^9/L with normal total white cell and platelet counts. Both
ALT (41 U/L) and CPK (405 U/L) were impaired. Despite a combination of oral
azithromycin, amantadine, and intravenous ceftriaxone, there were increasing bilateral
pulmonary infiltrates and progressive oxygen desaturation. Therefore, an open lung biopsy
was performed 9 days after admission. Histopathological examination showed a mild
interstitial inflammation with scattered alveolar pneumocytes showing cytomegaly, granular
amphophilic cytoplasm and enlarged nuclei with prominent nucleoli. No cells showed
inclusions typical of herpesvirus or adenovirus infection. The patient required ventilation
and intensive care after the operative procedure. Empirical intravenous ribavirin and
hydrocortisone were given. He succumbed 20 days after admission. In retrospect,
coronavirus-like RNA was detected in his nasopharyngeal aspirate, lung biopsy and
post-mortem lung. He had a significant rise in titer of antibodies against his own hSARS
isolate from 1/200 to 1/1600.

The second patient from whom a hSARS virus was isolated was a 42-year-old
female with good past health. She had a history of travel to Guangzhou in mainland China
for 2 days. She presented with fever and diarrhea 5 days after her return to Hong Kong.
Physical examination showed crepitation over the right lower zone, which had a
corresponding alveolar shadow on the chest radiograph. Investigation revealed leucopenia
(2.7 x 10^9/L), lymphopenia (0.6 x 10^9/L), and thrombocytopenia (104 x 10^9/L). Despite the
empirical antimicrobial coverage with amoxicillin-clavulanate, clarithromycin, and
oseltamivir, she deteriorated 5 days after admission and required mechanical ventilation and
intensive care for 5 days. She gradually improved without receiving treatment with
ribavirin or steroid. Her nasopharyngeal aspirate was positive for the virus in the RT-PCR
and she seroconverted from an antibody titre of <1/50 to 1/1600 against the hSARS isolate.

Virological findings:

Viruses were isolated on FRhK-4 cells from the lung biopsy and nasopharyngeal
aspirate, respectively, of the two patients described above. The initial cytopathic effect
appeared between 2 and 4 days after inoculation, but on subsequent passage, cytopathic
effect appeared in 24 hours. Neither virus isolate reacted with the routine panel of
reagents used to identify virus isolates, including those for influenza A and B, parainfluenza
types 1, 2 and 3, adenovirus and respiratory syncytial virus (DAKO, Glostrup, Denmark). They
also failed to react in RT-PCR assays for influenza A and HMPV or in PCR assays for
mycoplasma. The virus was ether sensitive, indicating that it was an enveloped virus.
Electron microscopy of negatively stained (2% potassium phospho-tungstate, pH 7.0) cell
culture extracts obtained by ultracentrifugation showed the presence of pleomorphic
enveloped viral particles, of about 80-90 nm (ranging 70-130 nm) in diameter, whose
surface morphology appeared comparable to members of Coronaviridae (Figure 5a). Thin
section electron microscopy of infected cells revealed virus particles of 55-90 nm diameter
within the smooth-walled vesicles in the cytoplasm (Figure 5b). Virus particles were also
seen at the cell surface. The overall findings were compatible with infections in the cells
caused by viruses of Coronaviridae.

A thin section electron micrograph of the lung biopsy of the 53-year-old male
contained 60-90 nm viral particles in the cytoplasm of desquamated cells. These viral
particles were similar in size and morphology to those observed in the cell-cultured virus
isolate from both patients (Figure 4).

The RT-PCR products generated in a random primer RT-PCR assay were analyzed
and unique bands found in the virus-infected specimen were cloned and sequenced. Of 30
clones examined, a clone containing 646 base pairs (SEQ ID NO:1) of unknown origin was
identified. Sequence analysis of this DNA fragment suggested this sequence had a weak
homology to viruses of the family of Coronaviridae (data not shown). The deduced amino
acid sequence (215 amino acids; SEQ ID NO:2) from this unknown sequence, however, had
the highest homology (57%) to the RNA polymerase of bovine coronavirus and murine
hepatitis virus, confirming that this virus belongs to the family of Coronaviridae.
Phylogenetic analysis of the protein sequences showed that this virus, though most closely
related to the group II coronaviruses, was a distinct virus (Figures 5a and 5b).

Based on the 646 bp sequence of the isolate, specific primers were designed for
RT-PCR detection of the hSARS virus genome in clinical specimens. Of the 44
nasopharyngeal specimens available from the 50 SARS patients, 22 had evidence of hSARS
RNA. Viral RNA was detectable in 10 of 18 fecal samples tested. The specificity of the
RT-PCR reaction was confirmed by sequencing selected positive RT-PCR amplified products.
None of 40 nasopharyngeal and fecal specimens from patients with unrelated diseases were
reactive in the RT-PCR assay.

To determine the dynamic range of real-time quantitative PCR, serial dilutions of
plasmid DNA containing the target sequence were made and subjected to the real-time
quantitative PCR assay. As shown in Figure 7A, the assay was able to detect as few as 10
copies of the target sequence. By contrast, no signal was observed in the water control
(Figure 7A). Positive signals were observed in 23 out of 29 serologically confirmed SARS
patients. In all of these positive cases, a unique PCR product (Tm = 82°C) corresponding to
the signal from the positive control was observed (Figure 7B, and data not shown). These
results indicated that this assay is highly specific to the target. The copy numbers of the
target sequence in these reactions ranged from 4539 to less than 10. Thus, as many as
6.48 x 10^5 copies of this viral sequence could be found in 1 ml of NPA sample. In 5 of the
above positive cases, it was possible to collect NPA samples before seroconversion. Viral RNA
was detected in 3 of these samples, indicating that this assay can detect the virus even at the
early onset of infection.
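
The use of a plasmid dilution series to quantify copy number can be sketched as follows. This is a minimal illustration only: it assumes that each standard yields a threshold cycle (Ct) that is linear in log10 of the input copy number, which is a common way of analysing real-time PCR data but is not described in the specification. The Ct values below are hypothetical and are not data from this study, and the function names are invented for the example.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold plasmid dilution series (illustrative values only).
standard_copies = [10, 100, 1_000, 10_000, 100_000]
standard_ct = [34.7, 31.4, 28.1, 24.8, 21.5]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
# Amplification efficiency implied by the slope (1.0 would mean perfect doubling).
efficiency = 10 ** (-1.0 / slope) - 1.0

unknown_ct = 28.1  # hypothetical clinical sample
estimated_copies = copies_from_ct(unknown_ct, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.2f}, copies={estimated_copies:.0f}")
```

Converting copies per reaction into copies per ml of NPA additionally requires the extraction, elution and reaction volumes, which are not restated here and are therefore left out of the sketch.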

To further validate the specificity of this assay, NPA samples from healthy
individuals (n=11) and patients suffering from adenovirus (n=11), respiratory syncytial virus
(n=11), human metapneumovirus (n=11), influenza A virus (n=13) or influenza B virus
(n=1) infection were recruited as negative controls. All of these samples, except one, were
negative in the assay. The false positive case was negative in a subsequent test. Taken
together, including the initial false positive case, the real-time quantitative PCR assay has
a sensitivity of 79% and a specificity of 98%.
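
The sensitivity and specificity figures can be reproduced from the counts given above: 23 positives among 29 serologically confirmed patients, and one false positive among the negative controls. The sketch below assumes the control group sizes are 11, 11, 11, 11, 13 and 1 as listed; the variable and function names are illustrative.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator as a float."""
    return numerator / denominator


# Serologically confirmed SARS patients tested by real-time quantitative PCR.
confirmed_patients = 29
true_positives = 23

# Negative controls: healthy, adenovirus, RSV, metapneumovirus, influenza A, influenza B.
control_groups = [11, 11, 11, 11, 13, 1]
total_controls = sum(control_groups)
false_positives = 1
true_negatives = total_controls - false_positives

sensitivity = proportion(true_positives, confirmed_patients)
specificity = proportion(true_negatives, total_controls)

print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

With these counts, 23/29 rounds to 79% and 57/58 rounds to 98%, matching the figures stated in the text.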

Epidemiological data suggest that droplet transmission is one of the major routes of
transmission of this virus. The detection of live virus and the detection of high copy numbers
of viral sequence in NPA samples in the current study clearly support that cough and sneeze
droplets from SARS patients might be the major source of this infectious agent.
Interestingly, 2 out of 4 available stool samples from the SARS patients in this study were
positive in the assay (data not shown). The detection of the virus in feces suggests that
there might be other routes of transmission. It is relevant to note that a number of animal
coronaviruses are spread via the fecal-oral route (McIntosh K., 1974, Coronaviruses: a
comparative review. Current Top Microbiol Immunol 63: 85-112). However, further
studies are required to test whether the virus in feces is infectious or not.

Currently, apart from this hSARS virus, there are two known serogroups of human
coronaviruses (229E and OC43) (Hruskova J. et al., 1990, Antibodies to human
coronaviruses 229E and OC43 in the population of C.R., Acta Virol 34:346-52). The
primer set used in the present assay does not have homology to the strain 229E. Due to the
lack of an available corresponding OC43 sequence in GenBank, it is not known whether
these primers would cross-react with this strain. However, sequence analyses of available
sequences in other regions of the OC43 polymerase gene indicate that the novel human virus
associated with SARS is genetically distinct from OC43. Furthermore, the primers used in
this study do not have homology to any of the sequences from known coronaviruses. Thus, it
is very unlikely that these primers would cross-react with the strain OC43.

Apart from the novel pathogen, metapneumovirus was reported to be identified in
some SARS patients (Centers for Disease Control and Prevention, 2003, Morbidity and
Mortality Weekly Report 52: 269-272). No evidence of metapneumovirus infection was
detected in any of the patients in this study (data not shown), suggesting that the novel
hSARS virus of the invention is the key player in the pathogenesis of SARS.

Immunofluorescent antibody detection:

Thirty-five of the 50 most recent serum samples from patients with SARS had
evidence of antibodies to the hSARS virus (see Fig. 3). Of 27 patients from whom paired
acute and convalescent sera were available, all seroconverted or had a >4 fold increase in
antibody titer to the virus. Five other pairs of sera from additional SARS patients from
clusters outside this study group were also tested to provide a wider sampling of SARS
patients in the community and all of them seroconverted. None of 80 sera from
patients with respiratory or other diseases and none of 200 sera from normal blood donors
had detectable antibody.
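
The classification of paired sera can be sketched as below. This is an illustration under stated assumptions: titres are handled as reciprocal dilutions (1/200 becomes 200), a serum below the detection limit is represented as None, and a four-fold or greater rise is treated as significant. The detection limit of 1/50 is taken from the second index patient described above. The function name and return labels are hypothetical.

```python
def classify_paired_sera(acute, convalescent, detection_limit=50, fold_threshold=4):
    """Classify a pair of reciprocal antibody titres.

    acute / convalescent: reciprocal titres (e.g. 1600 for 1/1600),
    or None when the serum is below the detection limit.
    """
    if convalescent is None or convalescent < detection_limit:
        return "no detectable antibody"
    if acute is None or acute < detection_limit:
        return "seroconversion"
    if convalescent / acute >= fold_threshold:
        return "significant rise"
    return "no significant change"


# The two index patients described in the clinical findings.
first_patient = classify_paired_sera(200, 1600)    # 1/200 to 1/1600, an 8-fold rise
second_patient = classify_paired_sera(None, 1600)  # <1/50 to 1/1600
print(first_patient, "|", second_patient)
```

Whether the boundary case of exactly four-fold counts as significant is a design choice in this sketch; the text above states the criterion as ">4 fold".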

When either seropositivity to HP-CV in a single serum or viral RNA detection in the
NPA or stool is considered evidence of infection with the hSARS virus, 45 of the 50 patients
had evidence of infection. Of the 5 patients without any virological evidence of
Coronaviridae viral infection, only one had sera tested > 14 days after
onset of clinical disease.

DISCUSSION

The outbreak of SARS is unusual in a number of aspects, in particular, in the
appearance of clusters of patients with pneumonia in health care workers and family
contacts. In this series of patients with SARS, investigations for conventional pathogens of
atypical pneumonia proved negative. However, a virus that belongs to the family
Coronaviridae was isolated from the lung biopsy and nasopharyngeal aspirate obtained
from two SARS patients, respectively. Phylogenetically, the virus was not closely related to
any known human or animal coronavirus or torovirus. The present analysis is based on a
646 bp fragment (SEQ ID NO:1) of the polymerase gene, which indicates that the virus
relates to antigenic group 2 of the coronaviruses along with murine hepatitis virus and
bovine coronavirus. However, viruses of the Coronaviridae can undergo heterologous
recombination within the virus family and genetic analysis of other parts of the genome
needs to be carried out before the nature of this new virus is more conclusively defined
(Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM. Fields Virology, 4th Edition,
Lippincott Williams & Wilkins, Philadelphia, 1187-1203). The biological, genetic and
clinical data, taken together, indicate that the new virus is not one of the two known human
coronaviruses.

The majority (90%) of patients with clinically defined SARS had either serological
or RT-PCR evidence of infection by this virus. In contrast, neither antibody nor viral RNA
was detectable in healthy controls. All 27 patients from whom acute and convalescent sera
were available demonstrated rising antibody titers to hSARS virus, strengthening the
contention that a recent infection with this virus is a necessary factor in the evolution of
SARS. In addition, all five pairs of acute and convalescent sera tested from patients from
other hospitals in Hong Kong also showed seroconversion to the virus. The five patients
who have not shown serological or virological evidence of hSARS virus infection need to
have later convalescent sera tested to determine whether they have also seroconverted.
However, the concordance of the hSARS virus with the clinical definition of SARS appears
remarkable, given that clinical case definitions are never perfect.

No evidence of HMPV infection, either by RT-PCR or rising antibody titer against
HMPV, was detected in any of these patients. No other pathogen was consistently detected
in our group of patients with SARS. It is therefore highly likely that this hSARS virus
is either the cause of SARS or a necessary pre-requisite for disease progression. Whether or
not other microbial or other co-factors play a role in progression of the disease remains to
be investigated.

The family Coronaviridae includes the genera Coronavirus and Torovirus. They are
enveloped RNA viruses which cause disease in humans and animals. The previously
known human coronaviruses, types 229E and OC43, are major causes of the common
cold (Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM. Fields Virology, 4th
Edition, Lippincott Williams & Wilkins, Philadelphia, pp. 1187-1203). They can
occasionally cause pneumonia in older adults, neonates or immunocompromised patients
(El-Sahly HM, Atmar RL, Glezen WP, Greenberg SB. Spectrum of clinical illness in
hospitalized patients with "common cold" virus infections. Clin Infect Dis. 2000; 31: 96-
100; and Foltz EJ, Elkordy MA. Coronavirus pneumonia following autologous bone
marrow transplantation for breast cancer. Chest 1999; 115: 901-905). Coronaviruses have
also been reported to be an important cause of pneumonia in military recruits, accounting
for up to 30% of cases in some studies (Wenzel RP, Hendley JO, Davies JA, Gwaltney JM.
Coronavirus infections in military recruits: Three-year study with coronavirus strains OC43
and 229E. Am Rev Respir Dis. 1974; 109: 621-624). Human coronaviruses can infect
neurons and viral RNA has been detected in the brain of patients with multiple sclerosis
(Talbot PJ, Cote G, Arbour N. Human coronavirus OC43 and 229E persistence in neural
cell cultures and human brains. Adv Exp Med Biol. (in press)). On the other hand, a number
of animal coronaviruses (e.g. Porcine Transmissible Gastroenteritis Virus, Murine Hepatitis
Virus, Avian Infectious Bronchitis Virus) cause respiratory, gastrointestinal, neurological
or hepatic disease in their respective hosts (McIntosh K. Coronaviruses: a comparative
review. Current Top Microbiol Immunol. 1974; 63: 85-112).

We describe for the first time the clinical presentation and complications of SARS.
Fewer than 25% of patients with coronaviral pneumonia had upper respiratory tract
symptoms. As expected in atypical pneumonia, both respiratory symptoms and positive
auscultatory findings were very disproportional to the chest radiographic findings.
Gastrointestinal symptoms were present in 10%. It is relevant that the virus RNA is detected
in faeces of some patients and that coronaviruses have been associated with diarrhoea in
animals and humans (Caul EO, Egglestone SI. Further studies on human enteric
coronaviruses. Arch Virol 1977; 54: 107-17). The high incidence of deranged liver function
tests, leucopenia, significant lymphopenia, thrombocytopenia and subsequent evolution into
adult respiratory distress syndrome suggests severe systemic inflammatory damage
induced by this hSARS virus. Thus immuno-modulation by steroid may be important to
complement the antiviral therapy by ribavirin. In this regard, it is pertinent that severe
human disease associated with the avian influenza subtype H5N1, another virus that
recently crossed from animals to humans, has also been postulated to have an
immuno-pathological component (Cheung CY, Poon LLM, Lau ASY et al. Induction of
proinflammatory cytokines in human macrophages by influenza A (H5N1) viruses: a
mechanism for the unusual severity of human disease. Lancet 2002; 360: 1831-1837). In
common with H5N1 disease, patients with severe SARS are adults, are significantly more
lymphopenic and have parameters of organ dysfunction beyond the respiratory tract (Table
4) (Yuen KY, Chan PKS, Peiris JSM, et al. Clinical features and rapid viral diagnosis of
human disease associated with avian influenza A H5N1 virus. Lancet 1998; 351: 467-471).
It is important to note that a window of opportunity of around 8 days exists from the onset
of symptoms to respiratory failure. Severe complicated cases are strongly associated with
both underlying disease and delayed use of ribavirin and steroid therapy. Following our
clinical experience in the initial cases, this combination therapy was started very early in
subsequent cases, which were largely uncomplicated cases at the time of admission. The
overall mortality at the time of writing is only 2% with this treatment regimen. There were
still 8 out of 19 complicated cases who had not shown a significant response. It is not
possible to perform a detailed analysis of the therapeutic response to this combination
regimen due to the heterogeneous dosing and time of initiation of therapy.

Other factors associated with severe disease are acquisition of the disease through
household contact, which may be attributed to a higher dose or longer duration of viral
exposure, and the presence of underlying diseases.

The clinical description reported here pertains largely to the more severe cases
admitted to hospital. We presently have no data on the full clinical spectrum of the
emerging Coronaviridae infection in the community or in an out-patient setting. The
availability of diagnostic tests as described here will help address these questions. In
addition, it will allow questions pertaining to the period of virus shedding (and
communicability) during convalescence, the presence of virus in other body fluids and
excreta and the presence of virus shedding during the incubation period, to be addressed.

The epidemiological data at present appear to indicate that the virus is spread by
droplets or by direct and indirect contact, although airborne spread cannot be ruled out in
some instances. The finding of infectious virus in the respiratory tract supports this
contention. Preliminary evidence also suggests that the virus may be shed in the feces.
However, it is important to note that detection of viral RNA does not prove that the virus is
viable or transmissible. If viable virus is detectable in the feces, this would be a potential
additional route of transmission that needs to be considered. It is relevant to note that a
number of animal coronaviruses are spread via the fecal-oral route (McIntosh K.
Coronaviruses: a comparative review. Current Top Microbiol Immunol 1974; 63: 85-112).

7. DEPOSIT 

A sample of isolated hSARS virus was deposited with the China Center for Type
Culture Collection (CCTCC) at Wuhan University, Wuhan 430072 in China on April 2,
2003 in accordance with the Budapest Treaty on the Deposit of Microorganisms, and
accorded accession No. CCTCC-V200303, which is incorporated herein by reference in its
entirety.

8. MARKET POTENTIAL 

The hSARS virus can now be grown on a large scale, which allows the development 
of various diagnostic tests as described hereinabove as well as the development of vaccines 
and antiviral agents that are effective in preventing, ameliorating or treating SARS. Given 
the severity of the disease and its rapid global spread, it is highly likely that significant 
demands for diagnostic tests, therapies and vaccines to battle against the disease, will arise 
on a global scale. In addition, this virus contains genetic information which is extremely 
important and valuable for clinical and scientific research applications. 

9. EQUIVALENTS 

Those skilled in the art will recognize, or be able to ascertain many equivalents to 
the specific embodiments of the invention described herein using no more than routine 
experimentation. Such equivalents are intended to be encompassed by the following claims. 

All publications, patents and patent applications mentioned in this specification are 
herein incorporated by reference into the specification to the same extent as if each 
individual publication, patent or patent application was specifically and individually 
indicated to be incorporated herein by reference. 

WHAT IS CLAIMED:

1. An isolated nucleic acid molecule consisting essentially of a nucleic acid sequence 
of SEQ ID NO:2471, or a complement thereof. 

2. An isolated nucleic acid molecule consisting essentially of a nucleic acid sequence
of SEQ ID NO:2473, or a complement thereof.

3. An isolated nucleic acid molecule which hybridizes under stringent conditions to a
nucleic acid molecule having the nucleotide sequence of claim 1 or 2, or a complement
thereof.

4. The nucleic acid molecule of claim 1, 2 or 3 wherein the molecule is RNA.

5. The nucleic acid molecule of claim 1, 2 or 3 wherein the molecule is DNA. 

6. An isolated nucleic acid molecule which hybridizes under stringent conditions to the 
nucleic acid molecule of claim 1 or 2, or a complement thereof, wherein the nucleic acid 
molecule encodes an amino acid sequence which has a biological activity exhibited by a 
polypeptide encoded by the nucleotide sequence of SEQ ID NO:2471 or 2473. 

7. An isolated polypeptide encoded by the nucleic acid molecule of claim 1 or 2. 

8. An antibody or an antigen-binding fragment thereof which immunospecifically 
binds to the N-gene protein of a hSARS virus. 

9. An antibody or an antigen-binding fragment thereof which immunospecifically 
binds to the S-gene protein of a hSARS virus. 

10. The antibody of claim 8, 9, or an antigen-binding fragment thereof which neutralizes 
the hSARS virus. 

11. An antibody which immunospecifically binds to the polypeptide of claim 7, or an 
antigen-binding fragment of said antibody. 

12. A method for detecting the presence of a N-gene of the hSARS virus in a biological 
sample, said method comprising:

(a) contacting the sample with a compound that selectively binds to said
N-gene; and

(b) detecting whether the compound binds to said N-gene in the sample. 

13. The method of claim 12, wherein the compound that binds to said N-gene is a 
nucleic acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 
35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 
700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 contiguous nucleotides of 
the nucleotide sequence of SEQ ID NO: 2471, or a complement thereof. 

14. The method of claim 12, wherein the compound that binds to said N-gene is a 
nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO:2475, 2476, 2480 
and/or 2481. 

15. A method for detecting the presence of a S-gene of the hSARS virus in a biological 
sample, said method comprising: 

(a) contacting the sample with a compound that selectively binds to said
S-gene; and

(b) detecting whether the compound binds to said S-gene in the sample. 

16. The method of claim 15, wherein the compound that binds to said S-gene is a 
nucleic acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 
35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 
700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, or 3,000 contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO:2473, or a complement thereof.

17. The method of claim 15, wherein the compound that binds to said S-gene is a 
nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO:2477 and/or 2478. 

18. A method for detecting the presence of the polypeptide of claim 7 in a sample, said 
method comprising: 

(a) contacting the sample with a compound that selectively binds to said 
polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample.

19. The method of claim 18, wherein the compound that binds to the polypeptide is an
antibody. 

20. A method for identifying a subject infected with the hSARS virus, said method 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject;

(b) reverse transcribing the total RNA to obtain cDNA; and

(c) subjecting the cDNA to real-time PCR assay using a set of primers
derived from a nucleotide sequence of the N-gene of the hSARS.

21. The method of claim 20, wherein the set of primers have nucleotide sequences of
SEQ ID NOS:2475 and 2476, respectively.

22. The method of claim 20, wherein the set of primers have nucleotide sequences of
SEQ ID NOS:2480 and 2481, respectively.

23. A method for identifying a subject infected with the hSARS virus, said method
comprising:

(a) obtaining total RNA from a biological sample obtained from the subject;

(b) reverse transcribing the total RNA to obtain cDNA; and

(c) subjecting the cDNA to real-time PCR assay using a set of primers
derived from a nucleotide sequence of the S-gene of the hSARS.

24. The method of claim 23, wherein the set of primers have nucleotide sequences of 
SEQ ID NOS:2477 and 2478, respectively. 

25. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence of SEQ ID NO:2475 and/or SEQ ID NO:2476.

26. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence of SEQ ID NO:2480 and/or SEQ ID NO:2481.

27. A kit comprising in one or more containers one or more isolated nucleic acid
molecules comprising a nucleotide sequence of SEQ ID NO:2477 and/or SEQ ID NO:2478. 

28. A kit comprising in one or more containers one or more antibodies of claim 8 or 9. 

29. A kit comprising in one or more containers one or more antibodies of claim 11. 
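
As an illustration of the primer-based detection recited in claims 20 to 24, the following minimal Python sketch shows how a primer pair brackets an amplicon on a cDNA strand. It is an in-silico analogy only, not the claimed assay. The function names `reverse_complement` and `find_amplicon` and every sequence in it are hypothetical toy data chosen for illustration; they are not the sequences of SEQ ID NOS:2475 to 2481, and exact string matching is assumed in place of real annealing behaviour.

```python
# Toy in-silico sketch: which stretch of a cDNA would a primer pair bracket?
# All sequences here are hypothetical examples, NOT the SEQ ID NOs of the claims.

_COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (lowercase a/c/g/t/n)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def find_amplicon(cdna, forward, reverse):
    """Return the predicted amplicon, or None if a primer has no exact match.

    The forward primer is searched for directly on the sense strand.
    The reverse primer anneals to the sense strand, so its reverse
    complement is what appears in the sense-strand text.
    """
    cdna = cdna.lower()
    start = cdna.find(forward.lower())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = cdna.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return cdna[start:end + len(reverse_site)]


# Hypothetical template: flank + forward site + spacer + reverse site + flank.
toy_cdna = "ggg" + "atgcatgc" + "ttttt" + "gatcca" + "aaa"
toy_forward = "atgcatgc"   # assumed forward primer
toy_reverse = "tggatc"     # assumed reverse primer (reverse complement of "gatcca")

print(find_amplicon(toy_cdna, toy_forward, toy_reverse))
```

A `None` result in this sketch corresponds only to the absence of an exact primer match in the toy template; it says nothing about the sensitivity or specificity of the assay itself.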
Leu Gln Phe Thr Ser Leu Glu Ile Pro Arg Arg Asn Val Ala Thr Leu
260 265 270

caa gca gaa aat gta act gga ctt ttt aag gac tgt agt aag atc att 865
Gln Ala Glu Asn Val Thr Gly Leu Phe Lys Asp Cys Ser Lys Ile Ile
275 280 285

act ggt ctt cat cct aca cag gca cct aca cac ctc agc gtt gat ata 913
Thr Gly Leu His Pro Thr Gln Ala Pro Thr His Leu Ser Val Asp Ile
290 295 300

aaa ttc aag act gaa gga tta tgt gtt gac ata cca ggc ata cca aag 961
Lys Phe Lys Thr Glu Gly Leu Cys Val Asp Ile Pro Gly Ile Pro Lys
305 310 315 320

gac atg acc tac cgt aga ctc atc tct atg atg ggt ttc aaa atg aat 1009
Asp Met Thr Tyr Arg Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn
325 330 335

tac caa gtc aat ggt tac cct aat atg ttt atc acc cgc gaa gaa gct 1057
Tyr Gln Val Asn Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala
340 345 350

att cgt cac gtt cgt gcg tgg att ggc ttt gat gta gag ggc tgt cat 1105
Ile Arg His Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His
355 360 365

gca act aga gat gct gtg ggt act aac cta cct ctc cag cta gga ttt 1153
Ala Thr Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe
370 375 380

tct aca ggt gtt aac tta gta gct gta ccg act ggt tat gtt gac act 1201
Ser Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr
385 390 395 400

gaa aat aac cta 1213
Glu Asn Asn Leu

c aga acc atg cct aac atg ctt agg ata atg gcc tct ctt gtt ctt gct 49
Arg Thr Met Pro Asn Met Leu Arg Ile Met Ala Ser Leu Val Leu Ala
1 5 10 15

cgc aaa cat aac act tgc tgt aac tta tca cac cgt ttc tac agg tta 97
Arg Lys His Asn Thr Cys Cys Asn Leu Ser His Arg Phe Tyr Arg Leu
20 25 30

gct aac gag tgt gcg caa gta tta agt gag atg gtc atg tgt ggc ggc 145
Ala Asn Glu Cys Ala Gln Val Leu Ser Glu Met Val Met Cys Gly Gly
35 40 45

tca cta tat gtt aaa cca ggt gga aca tca tcc ggt gat gct aca act 193
Ser Leu Tyr Val Lys Pro Gly Gly Thr Ser Ser Gly Asp Ala Thr Thr
50 55 60

gct tat gct aat agt gtc ttt aac att tgt caa gct gtt aca gcc aat 241
Ala Tyr Ala Asn Ser Val Phe Asn Ile Cys Gln Ala Val Thr Ala Asn
65 70 75 80

gta aat gca ctt ctt tca act gat ggt aat aag ata gct gac aag tat 289
Val Asn Ala Leu Leu Ser Thr Asp Gly Asn Lys Ile Ala Asp Lys Tyr
85 90 95

gtc cgc aat cta caa cac agg ctc tat gag tgt ctc tat aga aat agg 337
Val Arg Asn Leu Gln His Arg Leu Tyr Glu Cys Leu Tyr Arg Asn Arg
100 105 110

gat gtt gat cat gaa ttc gtg gat gag ttt tac gct tac ctg cgt aaa 385
Asp Val Asp His Glu Phe Val Asp Glu Phe Tyr Ala Tyr Leu Arg Lys
115 120 125

cat ttc tcc atg atg att ctt tct gat gat gcc gtt gtg tgc tat aac 433
His Phe Ser Met Met Ile Leu Ser Asp Asp Ala Val Val Cys Tyr Asn
130 135 140

agt aac tat gcg gct caa ggt tta gta gct agc att aag aac ttt aag 481
Ser Asn Tyr Ala Ala Gln Gly Leu Val Ala Ser Ile Lys Asn Phe Lys
145 150 155 160

gca gtt ctt tat tat caa aat aat gtg ttc atg tct gag gca aaa tgt 529
Ala Val Leu Tyr Tyr Gln Asn Asn Val Phe Met Ser Glu Ala Lys Cys
165 170 175

tgg act gag act gac ctt act aaa gga cct cac gaa ttt tgc tca cag 577
Trp Thr Glu Thr Asp Leu Thr Lys Gly Pro His Glu Phe Cys Ser Gln
180 185 190

cat aca atg cta gtt aaa caa gga gat gat tac gtg tac ctg cct tac 625
His Thr Met Leu Val Lys Gln Gly Asp Asp Tyr Val Tyr Leu Pro Tyr
195 200 205

cca gat cca tca aga ata tta ggc gca ggc tgt ttt gtc gat gat att 673
Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly Cys Phe Val Asp Asp Ile
210 215 220

gtc aaa cag atg gta cac tta tga ttg aaa ggt tcc gtg tca ctg gct 721
Val Lys Gln Met Val His Leu
225 230

att gat gc 729
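
The pairing of codon triplets with three-letter residues in the listing above follows the standard genetic code, so a row can be checked mechanically. The Python sketch below is offered only as an illustration of such a check. The names `CODON_TO_ONE`, `ONE_TO_THREE` and `translate` are hypothetical, and "Ter" is an assumed label for a stop codon. The example input is the first eight codons of the row ending at nucleotide position 97.

```python
# Standard genetic code written out in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "atg") to a one-letter residue code.
CODON_TO_ONE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names; "Ter" (stop) is an assumed label.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(codons):
    """Translate whitespace-separated codons into three-letter residue names."""
    return [ONE_TO_THREE[CODON_TO_ONE[c]] for c in codons.lower().split()]


row = "cgc aaa cat aac act tgc tgt aac"
print(" ".join(translate(row)))  # Arg Lys His Asn Thr Cys Cys Asn
```

A codon containing a character outside a, c, g, t raises a `KeyError` in this sketch. That makes it a simple way to flag characters that are not plain bases.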

1 atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt
61 ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 
121 gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 
181 tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 
241 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 
301 cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg 
361 gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt 
421 ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 
481 cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg
541 gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc 
601 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt 
661 ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 
721 cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa
781 ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 
841 ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 
901 tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt 
961 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 
1021 acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 
1081 tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag 
1141 actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 
1201 aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 
1261 acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa
1321 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc 
1381 tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 
1441 attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc 
1501 tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 
1561 tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag
1621 atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 
1681 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag 
1741 agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc 
1801 aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca 
1861 ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt
1921 gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 
1981 atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 
2041 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 
2101 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 
2161 gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 
2221 attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag 
2281 gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 
2341 gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 
2401 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct
2461 cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc
2521 tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc 
2581 ttcacaaatg gagctatcgt cggcacacca gtctgtgtaa atggcctcat gctcttagag 
2641 attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc
2701 tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg 
2761 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa
2821 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt 
2881 gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc
2941 aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct 
3001 ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 
3061 gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt 
3121 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga 
3181 gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 
3241 ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 
3301 actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct
3361 atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca
3421 ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 
3481 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 
3541 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca 
3601 tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt 
3661 ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 
3721 attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 
3781 aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact 
3841 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt 
3901 gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 
3961 gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg 
4021 tctttccttg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatatc 
4081 actngtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 
4141 ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt 
4201 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 
4261 ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga 
4321 gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 
4381 gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 
4441 gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 
4501 aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt 
4561 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca 
4621 gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca
4681 tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat
4741 tcaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 
4801 cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa
4861 ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac 
4921 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt 
4981 ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 
5041 aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac 
5101 catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa 
5161 tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat
5221 ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt 
5281 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc 
5341 gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt 
5401 ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 
5461 ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct
5521 tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa 
5581 tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 
5641 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat 
5701 tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag
5761 atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 
5821 accahcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa
5881 ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 
5941 ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca 
6001 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta 
6061 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat 
6121 tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac 
6181 caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 
6241 acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 
6301 atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct 
6361 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 
6421 atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 
6481 atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 
6541 gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg
6601 agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 
6661 tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta
6721 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct
6781 acaactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt 
6841 aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg 
6901 ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 
6961 aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 
7021 gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta 
7081 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag 
7141 ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 
7201 aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 
7261 agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca
7321 cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 
7381 agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 
7441 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat 
7501 gtctatgcaa atggaggccg tggcttctgc aagactcaca attggaattg tctcaattgt 
7561 gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 
7621 cagtttaaaa gaccaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct 
7681 gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga
7741 catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca 
7801 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag 
7861 tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagct 
7921 cttgtatcaa acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc
7981 gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca 
8041 gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 
8101 gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc
8161 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc
8221 acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat 
8281 gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta
8341 aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtactgc tgccaagaag 
8401 aacaacatac cttttacact aacttgtgct acaactagac aggttgtcaa tgtcataact 
8461 actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag
8521 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca 
8581 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 
8641 gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 
8701 gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 
8761 gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga
8821 gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt 
8881 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt 
8941 gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac 
9001 actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacac tcgttatgtg 
9061 cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta
9121 gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 
9181 atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca 
9241 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg
9301 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata 
9361 ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac 
9421 catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 
9481 ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 
9541 ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt 
9601 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg 
9661 ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 
9721 gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 
9781 gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 
9841 tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca 
9901 aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca 
9961 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa 
10021 gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg
10081 gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct
10141 aactatgaag atctgctcat tcgcaaatcc aaccatagct ttcttgttca ggctggcaat 
10201 gttcaacttc gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat 
10261 acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt 
10321 tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct 
10381 aatcatacca ttaaaggttc tttccttaat ggatcatgtg gtagtgttgg ttttaacatt 
10441 gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac
10501 gctggtactg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag 
10561 gctgcaggta cagacacaac cataacatta aatgttttgg catggctgta tgctgctgtt 
10621 atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt 
10681 gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct 
10741 ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg 
10801 cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca 
10861 ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt
10921 gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt 
10981 caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact 
11041 cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa gcacgcattc 
11101 ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg 
11161 cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct 
11221 ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg 
11281 acagctcgca ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt 
11341 acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc 
11401 ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat gtttttagct 
11461 agagctatag tgtttgtgtg tgttgagtat tacccattgt tatttattac tggcaacacc 
11521 ttacagtgta tcatgcttgt ttattgtttc ttaggctatt gttgctgctg ctactttggc 
11581 cttttctgtt tacrcaaccg ttacttcagg cttactcttg gtgtttatga ctacttggtc 
11641 tctacacaag aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt 
11701 gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt 
11761 gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt 
11821 cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac 
11881 aatgatattc ttcttgcaaa agacacaact gaagctttcg agaagatggt ttctcttttg 
11941 tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc 
12001 gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc 
12061 gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc 
12121 gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga ccgtgatgct 
12181 gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag 
12241 gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact 
12301 atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt 
12361 tgtgttccac tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct 
12421 gattatggta cctacaagaa cacttgtgat ggtaacaact ttacatatgc atctgcactc 
12481 tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag tgaaattaac
12541 atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca 
12601 gctgttaaac tacagaataa tgaactgagt ccagtagcac tacgacagat gtcctgtgcg 
12661 gctggtacca cacaaacagc ttgtactgat gacaatgcac ttgcctacta taacaattcg 
12721 aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 
12781 ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt 
12841 gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa aggcttaaac 
12901 aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga
12961 aatgctacag aagtacctgc caattcaact gtgctttcct tctgtgcttt tgcagtagac 
13021 cctgctaaag catataagga ttacctagca agtggaggac aaccaatcac caactgtgtg 
13081 aagatgttgt gtacacacac tggtaqagga caggcaatta ctgtaacacc agaagctaac 
13141 atggaccaag agtcctctgg tggtgcttca tgttgtctgt attgtagatg ccacattgac 
13201 catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact 
13261 tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg
13321 tggaaaggtt atggctgtag ttgtgaccaa ctccgcgaac ccttgatgca gtctgcggat 
13381 gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca
13441 caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaaagtgctg
13501 gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca 
13561 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag 
13621 agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt 
13681 ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa 
13741 tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag 
13801 aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 
13861 acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc 
13921 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg 
13981 tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 
14041 aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca
14101 tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 
14161 cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 
14221 accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 
14281 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 
14341 caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 
14401 ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 
14461 cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt
14521 ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca 
14581 atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 
14641 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 
14701 aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt 
14761 gtgatatcag acaactccta ttcgtagttg aagttgttga taaatacttt gattgttacg
14821 atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 
14881 tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 
14941 aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 
15001 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta 
15061 gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag 
15121 gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa 
15181 ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 
15241 gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca 
15301 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa
15361 gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg
15421 atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg
15481 taaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac
15541 aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 
15601 agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg 
15661 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg 
15721 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 
15781 accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag 
15841 atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg 
15901 tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 
15961 ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt 
16021 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt 
16081 ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 
16141 tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 
16201 cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg
16261 accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg
16321 ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 
16381 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 
16441 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat 
16501 gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc 
16561 ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg
16621 ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 
16681 ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 
16741 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca
16801 gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg
16861 taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 
16921 tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 
16981 tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 
17041 ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 
17101 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 
17161 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 
17221 tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag 
17281 tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 
17341 gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 
17401 tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa 
17461 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg
17521 tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 
17581 tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 
17641 aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 
17701 tctcacctta taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga 
17761 ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa
17821 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 
17881 ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 
17941 taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact 
18001 gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata
18061 taaaattcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 
18121 accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 
18181 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 
18241 tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 
18301 tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 
18361 cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac
18421 cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 
18481 gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 
18541 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 
18601 acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg 
18661 tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 
18721 gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta 
18781 gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg
18841 attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 
18901 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 
18961 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct
19021 acgatgctca gccatgtagt gacaaagctt acaaaataga ggaactcttc tattcttatg 
19081 ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc
19141 gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 
19201 taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 
19261 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc
19321 cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg 
19381 ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 
19441 accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt 
19501 acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 
19561 atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg
19621 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 
19681 aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 
19741 aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 
19801 taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 
19861 tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg
19921 atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 
19981 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 
20041 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 
20101 gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta
tcctattgga
ccaaaaatct 



ttcatacagc 
agtcatggac 
tcaccactta 
acagatgcgc 
gactttgtcg 
acaattgact 
ttctacccaa 
tacaagatgc 
gttataccaa 
aatacactta 
gataaaggag 
cttgtcgaht 
tgtgcaacag 
aggaccaaac 
ggatttataa 
tcttggaatg 
acaaatgtaa 
ccgaaggaac 
aatcctatcc 
agaggaactg 
ctggaaaaag 
cttgttaaca 
agtgaccttg 
acttcatcta 
ttaactcagg 
catacgtttg 
aaatcaaatg 
gtgattatta 
gacaaccctt 
gataatgcat 
gaaaagtcag 
ctctatgttt 
aacactttga 
attcttacag 
gttggctatt 
gatgctgttg 
gagattgaca 
gtgagattcc 
ttcccttctg 
gtgctctaca 
ttgaatgatc 
gtaagacaaa 
gatgatttca 
ggtaattata 
gacatatcta 
tgttattggc 
tacagagttg 
aaattatcca 
ggtactggtg 
gatgtttctg 
tcaccttgct 
gttgctgttc 
caactcacac 
ggctgtctta 
gctggcattt 
attgtggctt

23521 atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac
23581 ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct
23641 ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc
23701 aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg
23761 atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga
23821 aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga
23881 ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga
23941 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt
24001 tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg
24061 ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc
24121 aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg
24181 ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc
24241 aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga
24301 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa
24361 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca
24421 ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg
24481 ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg
24541 gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag
24601 cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact
24661 tcaccacagc gccagcaatt tgtcatgaag gcaaagpata cttccctcgt gaaggtgttt
24721 ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa
24781 ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca
24841 acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt
24901 acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt

24961
25021
25081
25141
25201
25261
25321
25381
25441
25501
25561
25621
25681
25741
25801
25861
25921
25981
26041
26101
26161
26221
26281
26341
26401
26461
26521
26581
26641
26701
26761
26821

ctgtcgtcaa
aatcactcat
atgtttggct
gttgcatgac
agtttgatga
cgaacttatg
aaaaattgac
agcctcactc
cgctaccaaa
gttcatttgc
tgcaggtaag
caacgcatgt
attactttat
accatataac
aaaactcaaa
agactatgtc
aattactaca
agacccaccg
aatggatcca
aagtgagtac
tagcgtactt
tgcgcttcga
ggtttacgtc
ggtctaaacg
gcagacaacg
gtaataggtt
aacaggtttt
gcttgttttg
gcaatggctt
tttgctcgta
cctctccggg
gtgatcattc

cattcaaaaa
tgaccttcaa
cggcttcatt
tagttgttgc
ggatgactct
gatttgttta
aatgcttctc
cctttcggat
ataattgcgc
aatttactgc
gaggcgcaat
agaattatta
gatgccaact
agtgtcacag
gaagactacc
gttgtacatg
gacactggta
aatgtgcaaa
atttatgatg
gaacttatgt
ctttttcttg
ttgtgtgcgt
tactcgcgtg
aactaactat
gtactattac
tcctattcct
tgtacataat
tgcttgctgt
gtattgtagg
cccgctcaat
ggacaattgt
gtggtcactt

gaaattgacc
gaattgggaa
gctggactaa
agttgcctca
gagccagttc
tgagattttt
ctgcaagtac
ggcttgttat
tcaataaaag
tgctattrgt
ttttgtacct
tgagatgttg
actttgtttg
atacaattgt
aaattggtgg
gctatttcac
ttgaaaatgc
tacacacaat
agccgacgac
actcattcgt
ctttcgtggt
actgctgcaa
ttaaaaatct
tattattatt
cgttgaggag
agcctggatt
aaagcttgtt
tgtctacaga
cttgatgtgg
gtggtcattc
gaccagaccg
gcgaatggcc

gcctcaatga
aatatgagca
ttgccatcgt
agggtgcatg
tcaagggtgt
tactcttgga
tgttcatgct
tggcgttgca
atggcagcta
taccatctat
ctatgccttg
gctttgttgg
ctggcacaca
cgttactgaa
ttattctgag
cgaagtttac
tacattcttc
cgacggctct
gactactagc
ttcggaagaa
attcttgcta
tattgttaac
gaactcttct
ctgtttggaa
cttaaacaac
atgttactac
ttcctctggc
attaattggg
cttagctact
aacccagaaa
ctcatggaaa
ggacactccc

ggtcgctaaa
atatattaaa
catggttaca
ctcttgtggt
caaattacat
tcaattactg
acagcaacga
tttcttgctg
gccctttata
tcacatcttt
atatattttc
aagtgcaaat
cataactatg
ggtgacggca
gataggcact
taccagcttg
atctttaaca
tcaggagttg
gtgcctttgt
acaggtacgt
gtcacactag
gtgagtttag
gaaggagttc
ctttaacatt
tcctggaaca
aatttgccta
tcttgtggcc
tgactggcgg
tcgttgcttc
caaacattct
gtgaacttgt
tagggcgctg
aatttaaatg 
tggccttggt 
atcttgcttt 
tcttgctgca 
tacacataaa 
cacagccagt 
taccgctaca 
tttttcagag 
agggcttcca 
tgcttgtcgc 
tacaatgcat 
ccaagaaccc 
actactgtat 
tttcaacacc 
caggtgttaa 
agtctacaca 
agcttgttaa 
ctaatccagc 
aagcacaaga 
taatagttaa 
ccatccttac 
taaaaccaac 
ctgatcttct 
gcttatcatg 
atggaaccta 
ttctaatcgg 
agtaacactt 
gattgcgatt 
cttcaggctg 
tctcaatgtg 
cattggtgct 
tgacattaag 
26881 gacctgccaa aagagatcac tgtggctaca 
26941 gcgtcgcagc gtgtaggcac tgattcaggt 
27001 aactataaat taaatacaga ccacgccggt 
27061 taagtgacaa cagatgtttc atcttgttga 
27121 tatcattatg aggactttca ggattgctat 
27181 agtgagacaa ttatttaagc ctctaactaa 
27241 acctatggag ttagattatc cataaaacga 
27301 ttgtatttac atcttgcgag ctatatcact 
27361 tactaaaaga accttgccca tcaggaacat 
27421 ctgacaataa atttgcacta acttgcacta 
27481 gtactcgaca tacctatcag ctgcgtgcaa 
27541 aagaggaggt tcaacaagag ctctactcgc 
27601 ttttaatact ttgcttcacc attaagagaa 
27661 cttctatttg tgctttttag cctttctgct 
27721 ttggttttca ctcgaaatcc aggatctaga 
27781 gaaacttctc attgttttga cttgtatttc 
27841 gcgctgtgca tctaataaac ctcatgtgct 
27901 gtaatactta tagcactgct tggctttgtg 
27961 ggcacactat ggttcaaaca tgcacaccta 
28021 gtggtgcgct tatagctagg tgttggtacc 
28081 gagacgtact tgttgtttta aataaacgaa 
28141 tcaaaccaac gtagtgcccc ccgcattaca 
28201 aaccagaatg gaggacgcaa tggggcaagg 
28261 aataatactg cgtcttggtt cacagctctc 
28321 cctcgaggcc agggcgttcc aatcaacacc 
28381 taccgaagag ctacccgacg agttcgtggt 
28441 agatggtact tctattacct aggaactggc 
28501 aaagaaggca tcgtatgggt tgcaactgag 
28561 ggcacccgca atcctaataa caatgctgcc 
28621 ttgccaaaag gcttctacgc agagggaagc 
28681 tcatcacgta gtcgcggtaa ttcaagaaat 
28741 cctgctcgaa tggctagcgg aggtggtgaa 
28801 ttgaaccagc ttgagagcaa agtttctggt 
28861 actaagaaat ctgctgctga ggcatctaaa 
28921 cagtacaacg tcactcaagc atttgggaga 
28981 ggggaccaag acctaatcag acaaggaact 
29041 tttgctccaa gtgcctctgc attctttgga 
29101 tcgggaacat ggctgactta tcatggagcc 
29161 aaagacaacg tcatactgct gaacaagcac 
29221 gagcctaaaa aggacaaaaa gaaaaagact 
29281 aagaagcagc ccactgtgac tcttcttcct 
29341 cttcaaaatt ccatgagtgg agcttctgct 
29401 accacacaag gcagatgggc tatgtaaacg 
29461 tactcttgtg cagaatgaat tctcgtaact 
29521 atctcacata gcaatcttta atcaatgtgt 
29581 cattttcatc gaggccacgc ggagtacgat 
29641 ctgcctatat ggaagagccc taatgtgtaa 
29701 attttaatag cttcttagga gaatgacaaa 



tcacgaacgc tttcttatta caaattagga 
tttgctgcat acaaccgcta ccgtattgga 
agcaacgaca atattgcttt gctagtacag 
cttccaggtt acaatagcag agatattgat 
ttggaatctt gacgttataa taagttcaat 
gaagaattat tcggagttag atgatgaaga 
acatgaaaat tattctcttc ctgacattga 
atcaggagtg tgttagaggt acgactgtac 
acgagggcaa ttcaccattt caccctcttg 
gcacacactt tgcttttgct tgtgctgacg 
gatcagtttc accaaaactt ttcatcagac 
cactttttct cattgttgct gctctagtat 
agacagaatg aatgagctca ctttaattga 
attccttgtt ttaataatgc ttattatatt 
agaaccttgt accaaagtct aaacgaacat 
tctatgcagt tgcatatgca ctgtagtaca 
tgaagatcct tgtaaggtac aacactaggg 
ctctaggaaa ggttttacct tttcatagat 
atgttactat caactgtcaa gatccagctg 
ttcatgaagg tcaccaaact gctgcattta 
caaattaaaa tgtctgataa tggaccccaa 
tttggtggac ccacagattc aactgacaat 
ccaaaacagc gccgacccca aggtttaccc 
actcagcatg gcaaggagga acttagattc 
aatagtggtc cagatgacca aattggctac 
ggtgacggca aaatgaaaga gctcagcccc 
ccagaagctt cacttcccta cggcgctaac 
ggagccttga atacacccaa agaccacatt 
accgtgctac aacttcctca aggaacaaca 
agaggcggca gtcaagcctc ttctcgctcc 
tcaactcctg gcagcagtag gggaaattct 
actgccctcg cgctattgct gctagacaga 
aaaggccaac aacaacaagg ccaaactgtc 
aagcctcgcc aaaaacgtac tgccacaaaa 
cgtggtccag aacaaaccca aggaaatttc 
gattacaaac attggccgca aattgcacaa 
atgtcacgca ttggcatgga agtcacacct 
attaaattgg atgacaaaga tccacaattc 
attgacgcat acaaaacatt cccaccaaca 
gatgaagctc agcctttgcc gcagagacaa 
gcggctgaca tggatgattt ctccagacaa 
gattcaactc aggcataaac actcatgatg 
ttttcgcaat tccgtttacg atacatagtc 
aaacagcaca agtaggttta gttaacttta 
aacattaggg aggacttgaa agagccacca 
cgagggtaca gtgaataatg ctagggagag 
aattaatttt agtagtgcta tccccatgtg 
aaaaaaaaaa aa 
1 - ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT - 60 
-ILGFYLPRKSQPTS ISCRSV 
-Y*VFTYPGKANQPRSLVDLF 
IRFLPTQEKPTNLDLL* ICS 
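
The listing pairs each 60-nucleotide line with translations in the three forward reading frames, using "*" for stop codons. The following is a minimal sketch of how such a translation can be reproduced, assuming the standard genetic code; the names `CODON_TABLE`, `translate` and `three_frames` are illustrative and do not come from the application. Unlike the listing, the sketch translates one line in isolation, so frames 2 and 3 stop at the last complete codon instead of borrowing bases from the next line.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate seq starting at offset frame (0, 1 or 2).

    Codons containing characters other than T, C, A, G (for example
    OCR damage) are reported as "X".
    """
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def three_frames(seq):
    """Return the translations of seq in the three forward frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    line = "ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT"
    for frame, protein in enumerate(three_frames(line), start=1):
        print(frame, protein)
```

Comparing the output for a nucleotide line with the three lines printed beneath it is one way to spot characters damaged during text extraction.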
61 - CTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTAC - 120 

- L * TNFKICVAVARLHA*CTY 

- SKRTLKSV*LSLGCMPSAPT 

LNEL*NLCSCRSAACLVHLR 
121 - GCAGTATAAACAATAATAAATTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCCCTCT - 180 

- A V * T I INFTVVDKKRVTRPS 
~QYKQ**ILLSLTRNE*LVPL 

SINNNKFYCR*QETSNSSLF 
181 - TCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATGATCAGCATACCTAGGTTTG - 240 
-SADCLRFRPCCSRSSAYLGF 
LQTAYGFVRVAVDKQHT* VS 
CRLLTVSSVLQSIISIPRFR 
241 - GTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAACA - 300 

- V R V * PKGKMESLVLGVNEKT 

- SGCDRKVRWRALFLiVSTRKH 

PGVTER* DGEPCSWCQRENT 
301 - CACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGG - 360 
-HVQLSLPVLQVRDVLVRGFG 
-TSNSVCLSFRLETC*CVASG 
RPTQFACPSG*RRASAWLRG 
361 - GACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGT - 420 
-DSVEEALSEAREHLKNGTCG 
TLWKRPYRRHVNTSKMALVV 
LCGRGPIGGT*TPQKWHLWS 
421 - CTAGTAGAGCTGGAAAAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAA - 480 
-LVELEKGVLPQLEQPYVFIK 
-**SWKKAYCPSLNSPMCSLN 
SRAGKRRTAPA* TALCVH*T 
481 - CGTTCTGATGCCTTAAGCACCAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATG - 540 

- R S DALSTNHGHKVVELVAEM 
-VLMP*APITATRSLSWLQKW 

F * CLKHQSRPQGR*AGCRNG 
541 - GACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC - 600 

- D G I QYGRSGI TLGVLVPHVG 
-TAFSTVVAV*HWEYSCHMWA 

RHSVRS*RYNTGSTRATCGR 
601 - GAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGAGCCGGT - 660 

- E T PIAYRNVLLRKNGNKGAG 
-KPQLHTAMFFFVRTVIREPV 

NPttCIPQCSSS*ER**GSRW 
661 - GGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGAT - 720 
~GH SY G I DLKSYDLGDELGTD 
-VIAMASI*SLMT*VTSLALI 
S*LWHRSKVL*LR*RAWH*S 
721 - CCCATTGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAA - 780 
-PI EDYEQNWNTKHGSGALRE 
-PLKIMNKTGTLSMAVVHSVN 
H*RL*TKLEH*AWQWCTP*T 
781 - CTCACTCGTGAGCTCAATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGC - 840 

- L T RELNGGAVTRYVDNNFCG 
-SLVSSMEVQSLAMSTTISVA 

HS*AQWRCSHSLCRQQFLWP 
841 - CCAGATGGGTACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG - 900 
-PDGYPLDCIKDFLARAGKSM 
-QMGTLLIASKIFSHARASQC 
RWVPS*LHQRFSRTRGQVNV 
901 - TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTACTGCTGCCGT - 960 
-CTLSEQLDYIES KRGVYCCR 
-ALFPNNLITSSRREVSTAAV 
HS FRTT*LHRVEERCLLLP* 
961 - GACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCTACGAGCACCAG - 1020 
-DHEHE IAWFTERSDKSYEHQ 
-TMSMKLPGSLSALI RATSTR 
P*A*NCLVH*AL**ELRAPD 
1021 - ACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG - 1080 
-TP F E I KSAKKFDTFKGECPK 
-HPSKLRVPRNLTLSKGNAQS 
TLRN*ECQEI *HFQRGMPKV 
1081 - TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAG - 1140 

- F V F P LNSKVKVI QPRVEKKK 

- LCFLLTQKSKS FNHVLKRKR 

CVSS*LKSQSHSTTC*KEKD 
1141 - ACTGAGGGTTTCATGGGGCGTATACGGTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT - 1200 
-TEGFMGRIRSVY PVASPQEC 
-LRVSWGVYALCTLLHLHRSV 
*GFHGAYTLCVPCCI S T G V * 
1201 - AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAG - 1260 
-NNMH LSTLMKCNHC DEVSWQ 
-TICTCLP**NVIIAMKFHGR 
QYALVYLDEM*SLR* SFMAD 
1261 - ACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCACTGAAAATTTAGTTATTGAA - 1320 
-TCDFLKATCEHCGTENLVIE 
~RATF*KPLVNIVALKI*LIiK 
VRLSESHIi*TL'WH*KFSY*R 
1321 - GGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAAAATGCCATGTCCTGCC - 1380 
-GPTTCGYLPTNAVVKMPCPA 
-DLLHVGTYLLML* * KCHVLP 
TYYMWVPTY*CCSENAMSCL 
1381 - TGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAAC - 1440 

- C Q D P E IGPEHSVADYHNHSN 

- VKTQRLDLSIVLQIITTTQT 

SRPRDWT*A*CCRLSQPLKH 
1441 - ATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC - 1500 
-I ET RLRKGGRTRCFGGCVFA 
-LKLDSAREVGLDVLEAVCLP 
*NSTPQGR*D*MFWRLCVCL 
1501 - TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGC - 1560 

- Y V G C YNKRAYWV PRASADI G 
-MLAAII SVPTGFLVLVLILA 

CWLL**ACLLGSSC*C*YWL 
1561 - TCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAG - 1620 
-SGHTG ITGDNVETLNE DL LE 
~QAILALLVTMWRP*MRISLR 
RPYWHYW*QCGDLE*GSP*D 
1621 - ATACTGAGTCGTGAACGTGTTAACATTAACATTGTTGGGGATTTTCATTTGAATGAAGAG - 1680 

- I I» S RERVNIN IVGDFHLNEE 

- Y *VVNVLTLTLLAI FI * M K R 

TES*TC*H*HCWRFSFE*RG 
1681 - GTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAG - 1740 
-VAIILASFSASTSAFI DTIK 
-LPSFWHLSLLLQVPLLTL*R 
CHHFGI FLCFYKCLY* HYKE 

1741 - AGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC - 1800 
-SLDYKSFKTIVESCGNYKVT 

- VLITSLSKPLLSPAVTIKLP 

S*LQVFQNHC*VLR*L*SYQ 
1801 - AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCA - 1860 
-KGKPVKGAWNI GQQRSVLTP 
-RES P * KVLGTLDNRDQF* HH 
GKARKRCLBHWTTEIS FNTT 
1861 - CTGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTT - 1920 
-LCGFP SQAAGVI RS I FARTL 
-CVVFPHRLLVLSDQFLRAHL 
VWFS LTGCWCYQINFCAHT* 
1921 - GATGCAGCAAACCACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGT - 1980 
-DAANHS I PDLQRAAVT IL DG 
-MQQTTQFLICKEQLSPYLMV 
CSKPLNS*FAKSSCHHT*WY 
1981 - ATTTCTGAACAGTCATTACGTCTTGTCGACGCCATGGTTTATACTTCAGACCTGCTCACC - 2040 
-I SEQS LRLVDAMVYTS DLLT 
-FLNSHYVLSTPWFILQTCSP 
F * TV ITSCRRHGLYFRPAHQ 
2041 - AACAGTGTCATTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG - 2100 

- N S V I I MAYVTGGLVQQT SQW 

- TVSLLWHM*LVVLYNRLLSG 

QCHYYGICNWWSCTTDFSVV 
2101 - TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGAATGGATTGAG - 2160 
-LSNLLGTTVEKLRPIFEWIE 
-CLIFWALLLKNSGLSLNGLR 
V*SFGHYC*KTQAYL*MD*G 
2161 - GCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATTTCTC - 2220 
-AKLSAGVEFLKDAW E I LKFL 
-RNLVQELNFSRMLGRFSNFS 
ET*CRS*ISQGCLGDSQISH 
2221 - ATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAG - 2280 
-ITGVFDI VKGQIQVAS DNIK 
-LQVFLTSSRVKYRLLQITSR 
YRCF*HRQGSNTGCFR*HQG 
2281 - GATTGTGTAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAA - 2340 
-DCVKCFI DVVNKALEMC IDQ 
-IV*NASLMLLTRHSKCALIK 
LCKM1jH*CC*QGTRNVH*SS 
2341 - GTCACTATCGCTGGCGCAAAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAA - 2400 

- V T IAGAKLRSLNLGEV FIAQ 
-SLSLAQSCDHST*VKSSSLK 

HYRWRKVAI TQIjR* slhrsk 
2401 - AGCAAGGGACTTTACCGTCAGTGTATACGTGGCAAGGAGCAGCTGCAACTACTCATGCCT - 2460 
-SKGLYRQCIRGKEQLQLLMP 
-ARDFTVSVYVARSSCNYSCL 
QGTLP SVYTVtfQGAAATTHAS 
2461 - CTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATTCACATGACACAGTACTTACC - 2520 
-LKAPKEVTFLEGDSH DTVLT 
-LRHQKK* PFLKVIHMTQYLP 
*GTKRSNLS*R*FT*HSTYL 
2521 - TCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCCGTTGATAGC - 2580 
-SEEVVLKNGELEALETPVDS 
-LRRLFSRTVNSKHSRRPLIA 
*GGCSQER*TRSTRDAR**L 
2581 - TTCACAAATGGAGCTATCGTCGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAG - 2640 
-FTNGAIVGTPVCVNGLMLLE 
-SQMELSSAHQSV*MASCS*R 
HKWSYRRHTSLCKWPHALRD 
2641 - ATTAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTC - 2700 
-IKDKEQYCALSPGLLATNNV 
-LRTKNNTAHCLLVYWLQTMS 
*GQRTILRIVSWFTGYKQCL 
2701 - TTTCGCTTAAAAGGGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGG - 2760 
-FRLKGGAP IKGVTFGEDTVW 
-FA*KGVHQLKV* PLEKILFG 
SLKRGCTN*RCMIjWRRYCLG 
2761 - GAAGTTCAAGGTTACAAGAATGTGAGAATCACATTTGAGCTTGATGAACGTGTTGACAAA - 2820 
-EVQGYKNVRITFELDERVDK 

-kfkvtrm*eshlslmnvltk 
ssrlqecenhi*a**tc*qs 

2821 - GTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGAATCCGGTACCGAAGTTACTGAGTTT - 2880 
-VLNEKCSVYTVESGTEVTEF 
-CLMKSALSTLLNPVPKLLSL 
A**KVLCLHC*IRYRSY*VC 

2881 - GCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTTCTGATCTCCTTACC - 2940 

-ACVVAEAVVKTLQPV S DLLT 
H V L * QRLL * RLYNQFLI S I> P 
MCCSRGCCEDFTTSF*SPYQ 
2941 - AACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGATGCT - 3000 

- N M G I DLDEWSVATFYLFDDA 
-TWVLILMSGV*LHSTYLMML 

HGY*S**VECSYILLI**CW 
3001 - GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAA - 3060 
-GEENFSSRMYCS FYPPDEEE 
-VKKTFHHVCIVPFTLQMRKK 
*RKLFITYVLFLLPSR*GRR 

3061 - GAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGT - 3120 

-EDDAECEEEE IDETCEHEYG 
-RTMQSVRKKKLMKPVNMSTV 
GRCRV*GRRN**NL*T*VRY 
3121 - ACAGAGGATGATTATCAAGGTCTCGCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGA - 3180 
-TEDDYQGLPLEFGASAETVR 

- Q R M I 1KVSLWNLVPQLKQFE 

RG*LSRSPSGIWCLS*NSSS 
3181 - GTTGAGGAAGAAGAAGAGGAAGACTGGCTGGATGATACTACTGAGCAATCAGAGATTGAG - 3240 
-VEEEEEEDWLDDTTEQSE IE 
-LRKKKRKTGWMILLSNQRLS 
*GRRRGRLAG*YY*AIRD*A 
3241 - CCAGAACCAGAACCTACACCTGAAGAACGAGTTAATCAGTTTACTGGTTATTTAAAACTT - 3300 
-PEPEPTPEEPVNQFTGYLKL 

- QNQNLHLKNQLISLLVI*NL 

RTRTYT*RTS*SVYWLFKTY 
3301 - ACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTGCTAATCCT - 3360 

- T DNVAI KCVDIVKEAQSANP 
-LTMLPLNVLTSLRRHKVLIL 

*QCCH*MC*HR*GGTKC*SY 
3361 - ATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCA - 3420 

- M V I VNAAN IHLKHGGGVAGA 
-W*L*MLLTYT*NMVVV*QVH 

SDCKCC*HTPETWWWCSRCT 
3421 - CTCAACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAAT - 3480 
-LNKATSIGAMQKE SD D Y I KLN 
-STRQPMVPCKRRVMITLS*M 
QQGNQWCHAKGE**LH*AKW 
3481 - GGCCCTCTTACAGTAGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGT - 3540 
-GPLTVGGSCLLSGHNLAKKC 
-ALLQ*EGLVCFLDt ILLRSV 
PSYSRRVLFAFWT*SC*EVS 
3541 - CTGCATGTTGTTGGACCTAACCTAAATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCA - 3600 
-LHVVG PNLNAGE D I QLLKAA 
-CMLLDLT*MQVRTSSFLRQH 
ACCWT*PKCR*GHPAS*GSI 
3601 - TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCAGCAGGCATATTT - 3660 
-YEN FN SQDILL APLLSAGI F 
-MKISIHRTSYLHHCCQQAYL 
*KFQFTGHLTCTIVVSRHIW 
3661 - GGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCGTACACAGGTTTAT - 3720 
-GAKPLQSLQVCVQTVRTQVY 
-VLNHFSLYKCACRRFVHRFI 
C* TTSVFTSVRADGSYTGLY 
3721 - ATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACCTG - 3780 

- I AVNDKALYEQVVM DYLDNL 
-LQSMTKLFMSRLSWIILIT* 

CSQ*QSSL*AGCHGLS**PE 
3781 - AAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACT - 3840 
-KPRVEAPKQEEPPNTEDSKT 
-SLEWKHLNKRSHQTQKIPKL 
A * SGST*TRGATKHRRFQN* 
3841 - GAGGAGAAATGTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATT - 3900 
-EBKSVVQKPVDVKPKIKACI 
-RRNLSYRSLSM* SQKLRPAL 
GEICRTEACRCEAKN*GLH* 
3901 - GATGAGGTTACCACAACACTGGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTT - 3960 
-DEVTTTLEETKFLTNKLLLF 
-MRLPQHWKKLS FLPISYSCL 
*GYHNTGRN*VSYQ*VTLVC 

3961 - GCTGATATCAATGGTAAGCTTTACCATGATTCTCAGAAGATGCTTAGAGGTGAAGATATG - 4020 

-ADINGKLYHDSQNMLRGEDM 
-LISMVSFTMILRTCLEVKXC 
* YQW*ALP*FSEHA*R*RYV 

4021 - TCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATGTTATCACTAGTGGTGATATC - 4080 

- S FLEKDAPYMVGDVITSGDI 
-LSLRRMHLTW*VMLSLVVIS 

FP*EGCTLBGR*CYH*W*YH 
4081 - ACTTGTGTTGTAATACCGTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTCAAGAGCT - 4140 
-TCVVI PSKKAGGTTEMLSRA 

- L V L * YPPKRLVALLRCSQEL 

LCCNTLQKGWWHY* DALKS F 
4141 - TTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCGTGGACAAGGATGTGCTGGT - 4200 
-LKKVPVDEYI TTYPGQGCAG 
~ * RKCQLMS I * PRTLDKDVLV 

EESAS**VYNHVPWTRMCWL 
4201 - TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTA - 4260 
-YTLEEAKTALKKCKSAFYVL 
I HLRKLRLLLRNANLHFMYY 
YT*GS*DCS*EMQICILCTT 
4261 - CCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGA - 4320 
-PSEAPNAKEEILGTVSWNLR 
-IiQKHLMLRKRF*BLYPGI*B 
FRST *C*GRDSRNCILEFER 
4321 - GAAATGCTTGCTCATGCTGAAGAGACAAGAAAATTAATGCCTATATGCATGGATGTTAGA - 4380 
-EMLAHAEETRKLMP I CMDVR 
-KCLLMLKRQEN * CLYAWMLE 
NACSC*RDKKINAYMHGC*S 
4381 - GCCATAATGGCAACCATCCAACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTT - 4440 
-AIMATIQRKYKGIKIQEGIV 

- P * WQPSNVSIKELKFKRASL 

HNGNHPT*V*RN*NSRGHR* 
4441 - GACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG - 4500 
-DYGVRFFFYTSKEPVASI I T 
-TMVSDSSFILVKSli*LLLLR 
LWCPILLLY* *RACSFYYYE 
4501 - AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGGT - 4560 
-KINS LNEPLVTMPI GYVTHG 
S * TL*MSRLSQCQLVM* HMV 
AELSK*AACHNANWLC DTWF 
4561 - TTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCA - 4620 
-FN LEEAARCMRS LKAPAVVS 
-LILKRLRAVCVLLKLLP*CQ 
*S*RGCALYAFS*SSCRSVS 
4621 - GTATCATCACCAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACA - 4680 
-VSSPDAVTTYNGYLTSSSKT 
-YHHQMLLLHIMDTSLRHQRH 
IITRCCYYI*WIPHFVIKDI 
4681 - TCTGAGGAGCACTTTGTAGAAACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTAT - 4740 
-SEEHFVETVSLAGSYRDWSY 

- LRSTL*KQFLWLALTEIGPI 

*GALCRNSFFGWLLQRLVLF 
4741 - TCAGGACAGCGTACAGAGTTAGGTGTTGAATTTCTTAAGCGTGGTGACAAAATTGTGTAC - 4800 
-SGQRTELGVEFLKRGDKIVY 
-QDSVQS*VLNFLSVVTKLCT 
RTAYRVRC*IS*AW*QNCVP 
4801 - CACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGGTTCTTTCACTTGACAAA - 4860 
-HTLE S PVEFHLDGEVLSLDK 
-TLWRAPSSFILTVRFFHLTN 
HSGEPRRVSS*R*GSFT*QT 
4861 - CTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACTGTGGAC - 4920 
-LKSLLSLREVKT IKVFTTVD 
" *RVSYPCGRLRL*KCSQLWT 
KESLIPAGG* DYKSVHNCGQ 
4921 - AACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGT - 4980 
-NTNLHTQLVDMSMTYGQQFG 

-TLISTHSLWICL*HMDSSLV 
H*SPHTACGYVYDIWTAVWS 
4981 - CCAACATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGT - 5040 
-PTYLDGADVTKIKPHVNHEG 
~ QHTWMVLMLQKLNLM* IMRV 
NILGWC*CYKN*TSCKS*G* 
5041 - AAGACTTTCTTTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC - 5100 

- K T FFVLPS DDTLRS EAFEYY 
-RLSLYYLVMTHYVVKLSSTT 

DFLCTT* **HTT* * S F R V L P 
5101 - CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCACACAAAGAAA - 5160 
-HTLDE SFLGRYMSALNHTKK 
-ILLMRVFLVGTCLL*TTQRN 
YS**EFSW*VHVCFKPHKEM 
5161 - TGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAAATGGGCTGATAACAATTGTTAT - 5220 
-WKFPQVGGLTSIKWADNNCY 
-GNFLKLVV*IiQLNGLI T I V I 
EISSSWWFNFN*MG* * Q L L F 
5221 - TTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATGCACCAGCACTT - 5280 

- L S SVLLALQQLEVK FNAPAL 
-CLVFY * HFNSLKSNSMHQHF 

V*CFISTSTA*SQIQCTSTS 
5281 - CAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC - 5340 
-QEAYYRARAGDAAN FCALIL 
-KRLIIEPVLVMLLTFVHSYS 
RGLL*SPCW*CC*LLCTHTR 
5341 - GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT - 5400 
-AYSNKTVGELGDVRETMTHL 
-LTVIKLLASLVMSEKL* PIF 
LQ**NCWRAW*CQRNYDPSS 
5401 - CTACAGCATGCTAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGT - 5460 
-LQHANLESAKRVLNVVCKHC 
-YSMLIWNLQSEFLMWCVNIV 
TAC*FGICKASS*CGV*TLW 
5461 - GGTCAGAAAACTACTACCTTAACGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCT - 5520 
-GQKTT TLTGVEAVMYMGTLS 

- VRKLLP*RV*KL*CIWVLYL 

SENYYLNGCRSCDVYGYSIL 
5521 - TATGATAATCTTAAGACAGGTGTTTCCATTCCATGTGTGTGTGGTCGTGATGCTACACAA - 5580 
-YDNLKTGVS IPCVCGRDATQ 
-MIILRQVFPFHVCVVVMLHN 
**S*DRCFHSMCVWS*CYTI 
5581 - TATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAA - 5640 
-YLVQQESS FVMMSAPPAEYK 

- I*YNKSLLLL*CLHHLLSIN 

SSTTRVFFCYDVCTTC*V*I 
5641 - TTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTGGTCAT - 5700 
-LQQGT FLCANEYTGNYQCGH 
-YSKVHSYVRMSTLVTISVVI 
TARYILMCE*VHW*LSVWSL 
5701 - TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAG - 5760 
-YTH I TAKETLYRIDGAHLTK 
-TLI*I*IiRRPSIVLTELTLQR 
HSYNC*GDPLSY*RSSPYKD 
5761 - ATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACA - 5820 
-MSEYKGPVTDVFYKETSYTT 
-CQSTKDQ*LMFSTRKHLTL»Q 
VRVQRTSD*CFLQGNILHYN 
5821 - ACCATCAAGCCTGTGTCGTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAA - 5880 
-TIKPVSYKLDGVTYTEIEPK 

- PSSLCRINSMELLTQRLNQN 

HQACVV*TRWSYLHRD*TKI 
5881 - TTGGATGGGTATTATAAAAAGGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTA - 5940 
-LDGYY KKDNAYYTEQP I DLV 
-ty?MGIIKRIMLTIQSSL*TX»Y 
GWVL*KG' V CLLYRAAYRPCT 
5941 - CCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTCAAACTCACATGTTCTAACACA - 6000 
-PTQPLPNAS FDNFKLTCSNT 
-QLNHYQMRVLI ISNSEVLTQ 
NSTITKCEF* *FQTHMF*HK 
6001 - AAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTCACGAGAGCTA - 6060 
-KFADDLWQMTGFTKPASREL 
-NLLMI * IK*QASQSQLHESY 
I C * * FKSNDRLHKAS FTRAI 
6061 - TCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTAT - 6120 

- S V T FF PDLNGDVVA I DYRHY 
-LSHSSQT*MAM*WLLT IDTI 

CHILPRLEWRCSGY*L*TLF 
6121 - TCAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAAC - 6180 

- S A S FKKGAKLLHKP IVWHIN 
-QRVSRKVLNYCISQLFGTLT 

SEFQERC* ITA*ANCLAH*P 
6181 - CAGGCTACAACCAAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGT - 6240 
-QATTKTTFKPNTWCLRCLWS 
-RLQPRQRSNQTLGVYVVFGV 
GYNQDNVQTKHLVFTLSLEY 
6241 - ACAAAGCCAGTAGATACTTCAAATTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGA - 6300 
-TKPVDTSN SFEVLAVEDTQG 

- Q S Q * ILQIHLKFWQ*KTHKE 

KASRYFKFI* SSGSRRHTRN 
6301 - ATGGACAATCTTGCTTGTGAAAGTCAACAACCCACCTCTGAAGAAGTAGTGGAAAATCCT - 6360 
-MDNLACESQQPT SEEVVENP 
-WTILLVKVNNPPLKK^WKIL 
GQSCL*KSTTHL*RSSGKSY 
6361 - ACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAAGTTGTAGGCAATGTC - 6420 
-TIQKEVIECDVKTTEVVGNV 

- P Y R R K S * SVT*KLPKL *AMS 

HTEGSHRV*RENYRSCRQCH 
6421 - ATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGATCTT - 6480 
-ILKPSDEGVKVTQELGHEDL 
YLNHQMKVLK* HKS * V M R I L 
T*TIR*RC*SNTRVRS*G*SY 
6481 - ATGGCTGCTTATGTGGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTA - 6540 
-MAAYVENT SITIKKPNELSL 
-WLLMWKTQALPLRNLMSFH* 
GCLCGKHKHYH*ET* * AFTS 
6541 - GCCTTAGGTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGG - 6600 
-ALGLKT IATHGIAAINSVPW 
p*V*KQLPLMVLLQLIVFLG 
LRFKNNCHSWYCCN** CSLE 
6601 - AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAAT - 6660 
-SKILAYVKPF LGQAAITTSN 
-VKFWLMSNHS* DKQQLQHQI 
*NFGLCQTILRTSSNYNIKL 
6661 - TGCGCTAAGAGATTAGCACAACGTGTGTTTAACAATTATATGCCTTATGTGTTTACATTA - 6720 
-CAKRLAQRVFNNYMPYVFTL 

- A L R D * HNVCLTI ICLMCLHY 

R*EISTTCV*QLYALCVYII 
6721 - TTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTAGAATTAGAGCTTCACTACCT - 6780 
-LFQLCTFTKSTNSRIRASLP 
-CSNCVLLLKVPILELELHYL 
VP I V Y F Y * KYQF*N*SFTTY 

6781 - ACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATT - 6840 

- T T IAKNSVKSVAKLCLDAGI 
-QLLLKIVLRVLLNYVWMPAL 

NYC*K*C*ECC*'IMFGCRH* 
6841 - AATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG - 6900 
-NYVKSPKFSKLFTI AMWLLL 
~IM*SHPNFLNCSQSLCGYCC 
LCEVTQIF*IVHNRYVAIVV 
6901 - TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCT - 6960 
-LSICLGSLICVTAAFGVLLS 
-*VFA*VL*SV*LLLLVYSYL 
KYLLRFSNLCNCCFWCTLI * 

6961 - AATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAAC - 7020 

- N FGAPSYCNGVRELYLNSSN 

ILVLLLIVMALENCILIRLT 
FWCSFLL*WR*RIVS*FV*R 

7021 - GTTACTACTATGGATTTCTGTGAAGGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTA - 7080 

-VTTMDFCEGSFPCS ICLSGL 
LLLWISVKVLFLAAFV*VD* 
YYYGFL*RFFSLQHLFKWIR 
7081 - GACTCCCTTGATTCTTATCCAGCTCTTGAAACCATTCAGGTGACGATTTCATCGTACAAG - 7140 
-DSLDSYPALETIQVTISSYK 

- TPLI LIQLLKPPR*RFHRT"S 

LP*FLSSS *NHSGDDFIVQA 
7141 - CTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA - 7200 

- L DLTILGLAAEWVLAYMLFT 
-*T*QF*VWPLSGFWHICCSQ 

RLDNFRSGR*VGFGIYVVHK 
7201 - AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTTGCT - 7260 
-KFFYLLGLSAIMQVFFGYFA 
~ NSFIY*VFQL*CRCSLAILL 
ILLFIRSFSYNAGVLWLFC* 
7261 - AGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCA - 7320 
-SHFISNSWLMWFIISIVQMA 

- VISSAILGSCGLSLVLYKWH 

S FHQQFLAHVVYH * Y C T N G T 
7321 - CCCGTTTCTGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAG - 7380 
-PVSAMVRMY I FFAS FYYIWK 
PFLQWLGCTSSLLLSTTYGR 
RFCNG* DVHLLCFFLLHMEE 
7381 - AGCTATGTTCATATCATGGATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGC - 7440 

- S YVHIMDGCTS STCMMCYKR 
-AMFI SWMVAPLRLA*CAISA 

LCSYHGWLHLFDLHDVL*AQ 
7441 - AATCGTGCCACACGCGTTGAGTGTACAACTATTGTTAATGGCATGAAGAGATCTTTCTAT - 7500 
-NRATRVSCTTI VNGMKRS FY 

- IVPHALSVQLLLMA*RDLSM 

SCHTR*VYNYC*WHEEIFLC 
7501 - GTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAATTGGAATTGTCTCAATTGT - 7560 
-V YANGGRG FCKTHNWNCLNC 
-SMQMEAVASARLTIGIVSIV 
LCKWRPWLLQDSQLELSQL* 
7561 - GACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATTTGTCACTC - 7620 
-DTFCTGSTFISDEVARDLSL 
-THFAbVVHSLVMKLLVICHS 
HILHW*YIH***SCS*FVTP 

7621 - CAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCT - 7680 
-QFKRP INPTDQS SY IVDSVA 

- SLKDQSTLLTSHRILLIVLL 

V*KTNQPY*PVIVYC**CCC 
7681 - GTGAAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGA - 7740 

- V K N GALHLYFDKAG QKTYER 

- * KMARFTSTLTRLVKRPMRD 

EKWRASPLL* QGWSKDL*ET 
7741 - CATCCGCTCTCCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA - 7800 
-HPLSH FVNLDNLRANNTKGS 
-IRSPILSI*TI*ELTTLKVH 
SALPFCQFRQFES * Q H * R F T 
7801 - CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAG - 7860 
-LP INVIVFDGKSKCDESASK 
-CLLMS * FLMASPNATSLLLS 
AY*CHSF*WQVQMRRVCF*V 
7861 - TCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTATTCTGTTGCTTGACCAAGCT - 7920 
-SASVYYSQLMCQPI LLLDQA 
-LLLCTTVS *CANLFCCLTKL 
CFCVLQSADVPTYSVA*PSS 
7921 - CTTGTATCAAACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTTTGATGCTTATGTC - 7980 

- L V S N V GDSTEV SVKMFDAYV 
-LYQTLEIVLKFPLRCLMLMS 

CIKRWR*Y*SFR*DV*CLCR 

7981 - GACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACA - 8040 

-DTFSATFSVPMEKLKALVAT 
T PFQQLLVFLWKNLRHLLLQ 
HLFSNF*CSYGKT'* GTCCYS 
8041 - GCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCA - 8100 
-AH S ELAKGVALDGV LSTFVS 
-LTAS*QRV*L*MVSFLHSCQ 
SQRVSKGCSFRWCPFYIRVS 
8101 - GCTGCCCGACAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTC - 8160 
-AARQGVVDTDVDTKDVI ECL 

- LPDKVLLI PMLTQRMLLNVS 

CPTRCC*YRC*HKGCY*MSQ 
8161 - AAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTC - 8220 
-KLSHH SDLEVTGDS CMNFML 
-NFH ITLT*K*QVTVVTISCS 
TFTSL*LRSDR*QL*QFHAH 
8221 - ACCTATAATAAGGTTGAAAACATGACGCCCAGAGATCTTGGCGCATGTATTGACTGTAAT - 8280 
-TYNKVENMTPRDLGAC I DCN 
-PIIRLKT*RPEILAHVLTVM 
L * * G*KHDAQRSWRMY*L*C 
8281 - GCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTA - 8340 
-ARHINAQVAKSHNVSLIWNV 
-QGISMPK*QKVTMFHSSGM* 
KAYQCPSSKKSQCFTHLECK 

8341 - AAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTACTGCTGCCAAGAAG - 8400 

-KDYMS LSEQLRKQI RTAAKK 
-KTTC LYLNSCVNKFVLLPRR 
RLHVFI*TAA*TNSYCCQEE 
8401 - AACAACATACCTTTTACACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACT - 8460 

- N N IP FTLTCATTRQVVNVI T 
-TTYLLH^LVLQLDRLSMS * L 

QHTFYTNLCYN* TGCQCHNY 
8461 - ACTAAAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAG - 8520 
-TKISLKGGKIVSTCITKLMLK 
-LKSHSRVVRLLVLVLN lclr 
*NLTQGW*DC*YLF*TYA*G 
8521 - GCCACATTATTGTGCGTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACA - 8580 
-ATLLCVLAALVCYIVMPVHT 
-PHYCAFLLHWFVI SLCQYIH 
HIIVRSCCIGLLYRYASTYI 
8581 - TTGTCAATCCATGATGGTTACACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGT - 8640 
-LSIHDGYTNEIIGYKAIQDG 

- CQSMMVTQMKSLVTKP F R M V 

VNP*WLHK*NHWLQSHSGWC 
8641 - GTCACTCGTGACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC - 8700 
-VTRDI I STDDCFANKHAGFD 
-SLVTSFLLMIVIaQINMLVLT 
HS*HHFY**LFCK*TCWF*R 
8701 - GCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGCCCTGTAGTAGCT - 8760 
-AWFSQRGGSYKNDKSCPVVA 
-HGLASVVVHTKMTKAAL* * L 
MV*PAWWFIQK*QKLPCSSC 
8761 - GCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAGA - 8820 
-A I ITREIGFIVPGLPGTVLR 
-LSLQERLVS*CLAYRVI,C*E 
YHYKRDWFHSAWLTGYCAES 
8821 - GGAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATT - 8880 

- A I N G D FLHFLPRVFSAVGN I 
-QSMVTSCIFYLVF LVLLATF 

NQW*LLAFSTSCP*CCWQHL 
8881 - TGCTACACACCTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTT - 8940 
-CYTPS KL I E Y S DFATSACVL 
-ATHLPNSLSIVILLPLLAFL 
LHTFQTH*V**FCYLCLRSC 
8941 - GCTGCTGAGTGTACAATTTTTAAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGAC - 9000 
-AAECT I FKDAMGKPVPYCYD 
-LLSVQFLRMLWANLCHIVMT 
C*VYNF*GCYGQTCAILL*H 
9001 - ACTAATTTGCTAGAGGGTTCTATTTCTTATAGTGAGCTTCGTCCAGACACTCGTTATGTG - 9060 
-TNLLEGSISYSELRPDTRYV 
-LIC*RVLFLIVSFVQTLVMC 
*FARGFYFL* *ASSRHSLCA 
9061 - CTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACCTGGAGGGTTCTGTTAGAGTA - 9120 
-LMDGS I IQFPNTYLEGSVRV 
-LWMVPSYSFLTLTWRVLLE* 
YGWFHHTVS*HLPGGFC*SS 
9121 - GTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAAGTAGGT - 9180 
-VTTFDAEYCRHGTCERSEVG 
-*QLLMLSTVDMVHAKGQK*V 
N N F * C*VL*TWYMRKVRSRY 
9181 - ATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCA - 9240 
-ICLST SGRWVLNNE HYRALS 
-FAYLPVVDGFLIMSITELYQ 
LPIYQW*MGS***ALQSSIR 
9241 - GGAGTTTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG - 9300 
-GVFCGVDAMNLIAN I FTPLV 
-EFSVVLMR*IS*LTSLLLLC 
SFLWC*CDESHS*HLYSSCA 
9301 - CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATA - 9360 
-QPVGALDVSASVVAGGIIAI 
-NLWVL.*MCLLQ*WLVVLLPY 
TCGCFRCVCFSSGWWYYCHI 
9361 - TTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGACGTGTTTTTGGTGAGTACAAC - 9420 
-LVTCAAY Y FM KFRRVFGEYN 
-W*LVLPTTL*NSDVFLVSTT 
GDLCCLLLYEIQTCFW*VQP 
9421 - CATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTCTTTCACTATACTCTGTCTGGTA - 9480 
-HVVAANALLFLMSFTI LCLV 
-MLLLLMHFCF*CLSLYSVWY 
CCCC * CTFVFDVFHYTLSGT 

9481 - CCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTAT - 9540 

-PAYS FLPGVYSVFYLYLTFY 
-QLTAFCRESTQSFTCT*HSI 
SLQLSAGSLLSLLLVLDILF 

9541 - TTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT - 9600 

-FTNDV SFLAHLQWFAMFS P I 
-5PMM FHSWLTFNGLPCFLLL 
HQ*CFILGSPSMVCHVFSYC 
9601 - GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGG - 9660 
-VPFWI TAIYVFC ISLKHCHW 
-CLFG*QQSMYSVFL*STAIG 
AFLDNSNLCILYFSEALPLV 
9661 - TTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTC - 9720 
-FFNNY LRKRVM FNGVT FSTF 
-SLTT ILGKESCLMELHLVPS 
L*QLS*EKSHV*WSYI*YLR 
9721 - GAGGAGGCTGCTTTGTGTACCTTTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGC - 9780 
-EEAALCTFLLNKEMYLKLRS 
-RRLLCVPFCSTRKCT*NCVA 
GGCFVYLFAQQGNVPKIA*R 
9781 - GAGACACTGTTGCCACTTACACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAG - 9840 
-ETLLP LTQYNRYLALYNKYK 
-RHCCHLHSITGILLYITSTS 
DTVATYTV*QVSCSI*QVQV 
9841 - TATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA - 9900 
-YFSGALDT TSYREAACCHLA 

- I S V E P * ILPAIVKQLAAT * Q 

FQWSLRYYQLS* SSLLPLSK 
9901 - AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCACAGACA - 9960 
-KALND FSNSGADVLYQ PPQT 
-RL^MTLATQVLMFSTNHHRH 
GSK*L*QLRC*CSLPTTTDI 
9961 - TCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAA - 10020 

- S ITSAVLQSGFRKMAFPSGK 
-QSLLLFCRVVLGKWHSRQAK 

NHFCCSAEWF*ENGI PVRQS 
10021 - GTTGAAGGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTG - 10080 
-VEGCMVQVTCGTTTLNGLWL 
-LKGAWYK* PVELQLLMDCGW 

*RVHGTSNLWNYNS*WIVVG 
10081 - GATGACACAGTATACTGTCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCT - 10140 
-DDTVYCPRHVI CTAEDMLNP 
-MTQYTVQDMSFAQQKTCLIL 
*HSILSKTCHLHSRRHA*S* 
10141 - AACTATGAAGATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT - 10200 
-NYEDLLIRKSNHSFLVQAGN 
-TMKICSFANPTIAFLFRLAM 
L*RSAHSQIQP*LSCSGWQC 
10201 - GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTCTGCTTAGGCTTAAAGTTGAT - 10260 

- V Q L R V IGHSMQNCLLRLKVD 

-FNFVLLAILCKIVCLGLKLI

STSCYWPFYAKLSA*A*S*Y 
10261 - ACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGGTCAAACATTT - 10320 
-TSNPKTPKYKFVRI QPGQTF 
-LLTLRHPS INLSVSNLVKHF 
F*P*DTQV*ICPYPTWSNIF 
10321 - TCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCT - 10380 
-SVLACYNG SPSGVYQCAMRP 
-QF*HATMVHHLVFISVP* DL 
SSSMLQWFTIWCLSVCHET* 
10381 - AATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATT - 10440
-NHTIKGSFLNGSCGSVGFNI 

- I I PLKVLSLMDHVVVLVLTL 

SYH*RFFP*WIMW*CWF*H* 
10441 - GATTATGATTGCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC - 10500 
-DYDCVSFCYMHHMELPTGVH 
-IMIACLSAICIIWSFQQEYT 
L*LRVFLLYASYGASNRSTR 
10501 - GCTGGTACTGACTTAGAAGGTAAATTCTATGGTCCATTTGTTGACAGACAAACTGCACAG - 10560
-AGTDLEGKFYGP FVDRQTAQ 
-LVLT*KVNSMVHLLTDKLHR 
WY*LRR*ILWSIC*QTNCTG 
10561 - GCTGCAGGTACAGACACAACCATAACATTAAATGTTTTGGCATGGCTGTATGCTGCTGTT - 10620 
-AAGTDTT I TLNVLAWLYAAV 
-LQVQTQP*H*MFWHGCMLLL 
CRYRHNHNIKCFGMAVCCCY
10621 - ATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTGAATGACTTTAACCTT - 10680
-INGDRWFLNRFTTTLNDFNL

-SMVIGGFLIDSPLL*MTLTL

QW**VVS**IHHYFE*L*PC
10681 - GTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGGACCT - 10740
-VAMKYNYEPLTQDHVDILGP
-WQ*STTMNL*HKIMLTYWDL
GNEVQL*TFDTRSC*HIGTS
10741 - CTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG - 10800 
-LSAQT GIAVL DMCAALKELL 
-FLLKQELPS*ICVLL*KSCC 
FCSNRNCRLRYVCCFERAAA
10801 - CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACA - 10860 
-QNGMNGRT ILGST ILEDE FT 
-RMV*MVVLSLVALF*KMSLH 
EWYEWSYYPW* HYFRR*VYT 
10861 - CCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATT - 10920
-PFDVVRQC SGVTFQGKFKKI 
-HLMLLDNALVLPSKVSSRKL 
I*CC*TMLWCYLPR*VQENC
10921 - GTTAAGGGCACTCATCATTGGATGCTTTTAACTTTCTTGACATCACTATTGATTCTTGTT - 10980
-VKGTHHWMLLTFLTSLLILV
-LRALIIGCF*LS*HHY*FLF
*GHSSLDAFNFLDITIDSCS
10981 - CAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGAATGCTTTCTTGCCATTTACT - 11040 
-QSTQWSLFFFVYENAFLPFT 
-KVHSGHCFSLFTRMLSCHLL 
KYTVVTVFLCLRECFLAIYS 
11041 - CTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC - 11100 
-LGIMAIAACAMLLVKHKHAF 
-LVLWQLLHVLCCLLSISTHS
WYYGNCCMCYAAC*A*ARIL 
11101 - TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATG - 11160 
-LCLFLLPSLATVAY FNMVYM 

- CACFCYLLLQQLLTLIW STC 

VLVSVTFSCNSCLL* YGLHA 
11161 - CCTGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCT - 11220 
-PASWVMRIMTWLELADTSLS 
-LLAG*CVS*HGLNWLTLACL 
C * LGDAYHDMA* IG*H*LVW 
11221 - GGTTATAGGCTTAAGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATG - 11280

-GYRLKDCVMYASALVLLILM
-VIGLRIVLCMLQL*FCLFS*

L*A*GLCYVCFSFSFAYSHD
11281 - ACAGCTCGCACTGTTTATGATGATGCTGCTAGACGTGTTTGGACACTGATGAATGTCATT - 11340 
-TARTVYDDAARRVWTLMNVI 

- QLALFMMMLLDVFGH* * M S L 

SSHCL**CC*TCLDTDECHY 
11341 - ACACTTGTTTACAAAGTCTACTATGGTAATGCTTTAGATCAAGCTATTTCCATGTGGGCC - 11400 

- T L V YKVYYGNAL DQAI SMWA 
-HLFTKSTMVML* IKLFPCGP 

TCLQSLLW*CFRSSYFHVGL 
11401 - TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCATGTTTTTAGCT - 11460

-LVISVTSNYSGVVTTIMFLA
-*LFL*PLTILVSLRLSCF*L

SYFCNL*LFWCRYDYHVFS* 
11461 - AGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAACACC - 11520 
-RAIVFVCVEYYPLLFITGNT 
-EL*CLCVLSITHCYLLLATP
SYSVCVC*VLPIVIYYWQHL 
11521 - TTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGC - 11580

-LQCIMLVYCFLGYCCCCYFG
-YSVSCLFIVS*AIVAAATLA

TVYHACLLFLRLLLLLLLWP
11581 - CTTTTCTGTTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTC - 11640
-LFCLLNRYFRLTLGVYDYLV
-FSVYSTVTSGLLLVFMTTWS
FLFTQPLLQAYSWCL*LLGL
11641 - TCTACACAAGAATTTAGGTATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATT - 11700

-STQEFRYMNSQGLLPPKSSI
-LHKNLGI*TPRGFCLLRVVL

YTRI*VYELPGAFAS*E*Y*
11701 - GATGCTTTCAAGCTTAACATTAAGTTGTTGGGTATTGGAGGTAAACCATGTATCAAGGTT - 11760 
-DAFKLNIKLLGI GGKPCIKV 
-MLSSLTLSCWVLEVNHVSRL

CFQA*H*VVGYWR*TMYQGC
11761 - GCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATCTGTGGTACTGCTCTCGGTT - 11820
-ATVQSKMSDVKCTSVVLLSV 
-LLYSLKCLT* SAHLWYCSRF 
YCTV*NV*RKVHICGTALGS 
11821 - CTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTACAACTCCAC - 11880 
-LQQLRVESSSKLWAQCVQLH 
-FNNLE*SHLLNCGHNVYNST
STT*SRVIF*IVGTMCTTPQ 
11881 - AATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTG - 11940 

- N D I LLAKDTTEAFE KMVSLL 

- M I FFLQKTQLKLSRRW FLFC 

* YSSCKRHN* SFREDGFSFV 
11941 - TCTGTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTC - 12000 
-SVLLSMQGAVDINRLCEEML 

- L F C Y PCRVL*TLIGCARKCS 

CFAIHAGCCRH* *VVRGNAR 
12001 - GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCC - 12060 

- D N R A T LQAIASEFS SLPSYA 
-ITVLLFRLLLQNLVLYHHMP 

*PCYSSGYCFRI*FFTIICR 
12061 - GCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTC - 12120 
-AYATAQEAYEQAVAN GDSEV 
-LMPLPRRPMSRL*LMVILKS 
LCHCPGGL*AGCS*W*F*SR 
12121 - GTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCTAAATCTGAGTTTGACCGTGATGCT - 12180 
-VLKKLKKSLNVAKSE FDRDA 
-FSKS*RNL*MWLNLSLTVML
SQKVKEIFECG*I*V*P*CC
12181 - GCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAG - 12240 
-AMQRKLEKMADQAMTQMYKQ 
-PCNASWKRWQIRL* PKCTNR 
HATQVGKDGRSGY DPNVQTG 
12241 - GCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCACT - 12300 
-ARSEDKRAKVTSAMQTMLFT 
-QDLRTRGQK*LVLCKQCSSL 
KI * GQEGKSN*CYANNALHY 
12301 - ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGT - 12360
-MLRKL DN DALNN I INNARDG 
-CLGSLIMMHLTTLSTMRVMV 
A * E A * * *CT*QHYQQCA*WL 
12361 - TGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCT - 12420 
-CVPLNI IPLTTAAKLMVVVP 
-VFHSTSYH*LQQPNSWLLSL
CSTQHHTIDYSSQTHGCCP* 
12421 - GATTATGGTACCTACAAGAACACTTGTGATGGTAACACCTTTACATATGCATCTGCACTC - 12480 
-DYGTYKNTCDGNT FTYASAL 

- IMVPTRTLVMVTPLHMHLHS 

LWYLQEHL*W*HLYICICTL 
12481 - TGGGAAATCCAGCAAGTTGTTGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAAC - 12540
-WEIQQVVDADSKIVQLSEIN
-GKSSKLLMRIARLFNLVKLT
GNPASC*CG*QDCST**N*H
12541 - ATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA - 12600 

- M D N S PNLAWPLIVTALRANS 

- WTIHQIWLGLLLLQL*EPTQ 

GQFTKFGLASYCYSSKSQLS
12601 - GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTCCTGTGCG - 12660
-AVKLQNNELSPVALRQMSCA 
-LLNYRIMN*VQ*HYDRCPVR 
C*TTE**TESSSTTTDVLCG

12661 - GCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCG - 12720
-AGTTQTACTDDNALAYYNNS

-LVPHKQLVLMTMHLPTITIR

WYHTNSLY**QCTCLL*QFE
12721 - AAGGGAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGA - 12780
-KGGRFVLALLSDHQDLKWAR 
-REVGLCWHYYQ TTKI SNGLD 
GR*VCAGITIRPPRSQMG*I 
12781 - TTCCCTAAGAGTGATGGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTT - 12840
-FPKSDGTGTIYTELEPPCRF
-SLRVMVQVQFTQNWNHLVGL
P*E*WYRYNLHRTGTTL*VC
12841 - GTTACAGACACACCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC - 12900
-VT DT PKGPKVKYLYFI KGLN 
-LQTHQKGLK*NTCTSSKA*T 
YRHTKRA* SEILVLHQRLKQ 
12901 - AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCTTCAGGCTGGA - 12960 
-NLNRGMVLGSLAATVRLQAG 

-T*IEVWCWAV*LLQYVFRLE

PK*RYGAGQFSCYSTSSGWK 
12961 - AATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCTTTTGCAGTAGAC - 13020 
-NATEV PANSTVLSFCAFAVD 
-MLQKYLPIQLCFPSVLLQ*T 
CYRSTCQFNCAFLLCFCSRP 
13021 - CCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGTG - 13080
-PAKAYKDYLASGGQPITNCV
-LLKHIRIT*QVEDNQSPTV* 
C*SI*GLPSKWRTTNHQLCE 
13081 - AAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAAC - 13140 
-KMLCT HTGTGQAITVT P E A N 
-RCCVHTLVQDRQLL*HQKLT 
DVVYTHWYRTGNYCNTRS*H 
13141 - ATGGACCAAGAGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC - 13200 
-MDQES FGGASCCLYCRCHI D 
-WTKSPLVVLHVVCIVDATLT 
GPRVLWWCFMLSVL*MPH*P 
13201 - CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACT - 13260
-HPNPKGFCDLKGKYVQIPTT

-IQILKDSVT*KVSTSKYLPL

SKS*RIL*LER*VRPNTYHL
13261 - TGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAGTCTGTACCGTCTGCGGAATG - 13320 
-CANDPVGFTLRNTVCTVCGM 
-VLMTQWVLHLETQSVPSAEC 
C**PSGFYT*KHSLYRLRNV
13321 - TGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCTTGATGCAGTCTGCGGAT - 13380 
-WKGYG CSCDQLREPLMQSAD 
-GKVMAVVVTNSANP*CSLRM 
ERLWL*L*PTPRTLDAVCGC
13381 - GCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCA - 13440 
-ASTFLNGFAV*VQPVLHRAA 

- HQRF*TGLRCKCSPSYTVRH 

INVFKRVCGVSAARLTPCGT
13441 - CAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAAGTGCTG - 13500
-QALVLMSSTGLLIFTTKKVL
-RH*Y*CRLQGF*YLQRKKCW 
GTSTDVVYRAFDIYNEKSAG 
13501 - GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCA - 13560 
-VLQSS *KLIAVASRRRMRKA 
-FCKVPKN* LLSLPGEG*GRQ 
FAK FLKTNCCRFQEKDEEGN 
13561 - ATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAG - 13620
-IY*TLTL*LRGILCLTTNMK
-FIRLLLCS*EAYYV*LPT*R
LLDSYFVVKRHTMSNYQHEE 
13621 - AGACTATTTATAACTTGGTTAAAGATTGTCCAGCGGTTGCTGTCCATGACTTTTTCAAGT - 13680
-RLFITWLKIVQRLLSMTFSS 
-DYL*LG*RLSSGCCP*LFQV 
TIYNLVKDCPAVAVHDFFKF 
13681 - TTAGAGTAGATGGTGACATGGTACCACATATATCACGTCAGCGTCTAACTAAATACACAA - 13740
-LE*MVTWYH I Y H V S V * LNTQ 
-*SRW*HGTTYITSASN*IHN 
RVDGDMVPHI SRQRLTKYTM 
13741 - TGGCTGATTTAGTCTATGCTCTAGGTCATTTTGATGAGGGTAATTGTGATACATTAAAAG - 13800 
-WLI*SMLYVILMRVIVIH*K 
-G*FSLCSTSF**G*L*YIKR 
ADLVYALRHFDEGNCDTLKE 
13801 - AAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG - 13860
-KYSSHTIAVMMI IS IRRIGM 
-NTRHIQLL* * * LFQ*EGLV* 
I LVTYNCCDDDYFNKKDWYD 
13861 - ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCC - 13920 
-TS*RILTSYAYMLT*VSVYA 
-LRRES*HLTRIC*LR*ACTP 
FVENPDI LRVYANLGERVRQ 
13921 - AATCATTATTAAAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCG - 13980 
-NHY*RLYNSAMLCVMQAL*A 
-IIIKDCTILRCYA*CRHCRR 
SLLKTVQFCDAMRDAGIVGV 
13981 - TACTGACATTAGATAATCAGGATCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTAC - 14040 
- Y * H * I IRILMGTGTISVI SY 
-TDIR*SGS*WELVRFR*FRT 
LTLDNQDLNGNWYDFGDFVQ 
14041 - AAGTAGCACCAGGCTGCGGAGTTCCTATTGTGGATTCATATTACTCATTGCTGATGCCCA - 14100 
-K*HQAAEFLLWIHITHC* CP 
-SSTRLRSSYCGFILLIADAH 
VAPGCGVPIVDSYYSLLMPI 
14101 - TCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGATCTCGCAAAAC - 14160 
-SSL*LGHWLLSPIWMLISQN 
-PHFD*GIGC*VPYGC*SRKT 
LTLT RALAAESHMDADLAKP 
14161 - CACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCTCTTCG - 14220 
-HLLSGIC*NMI LRKRDFVSS 
-TY*VGFAEI *FYGRETLSLR 
LIKWDLLKYDFTEERLCLFD 
14221 - ACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATG - 14280 
-TVI LN IGTRHT I PIVLTVWM 
-PLF*ILGPDIPSQLY*LFG*
RYFKYWDQTYHPNCINCLDD
14281 - ATAGGTGTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTA - 14340
-IGVSFIVQTLMCYFLLCFHL 
-*VYPSLCKL*CVIFYCVSTY 
RCILHCANFNVLFSTVFPPT 
14341 - CAAGTTTTGGACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA - 14400
-QVLDH * * E K Y L * MVFLLLFQ 
-KFWTTSKKNICRWCSFCCFN 
SFGPLVRKIFVDGVPFVVST 
14401 - CTGGATACCATTTTCGTGAGTTAGGAGTCGTACATAATCAGGATGTAAACTTACATAGCT - 14460
-LDTIFVS*ESYIIRM*TYIA
-WIPFS*VRSRT*SGCKLT*L 
GYHFRELGVVHNQDVNLHSS 
14461 - CGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGATCCAGCTATGCATGCAGCTT - 14520 

- R V SVS RNF*CMLLI QLCMQL 
-ASQFQGTFSVCC* SSYACSF 

RLSFKELLVYAADPAMHAAS 
14521 - CTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCACTAACAAACA - 14580
-LAIYC*INALHAFQ*LH*QT 

- WQFIAR*THYMLFSSCTNKQ 

GNLLLDKRTTCFSVAALTNN 
14581 - ATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTG - 14640
-MLLFKLSNPVILIKTFMTLL
-CCFSNCQTR*F**RLL*LCC 
VAFQTVKPGNFNKDFYDFAV 
14641 - TGTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC - 14700
-CLKVSLRKEVLLN*NTSSLL

- V * RFL*GRKFC* TKTLLLCS 

SKGFFKEGSSVELKHFFFAQ 
14701 - AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGT - 14760
-RMATLLSVIMTIIVIICQQC

-GWQRCYQ*L*LLSL*SANNV

DGNAAISDYDYYRYNLPTMC 
14761 - GTGATATCAGACAACTCCTATTCGTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACG - 14820 
-V I S DNSYS *LKLLINTLIVT 
-*YQTTPIRS*SC**IL*LLR
DIRQLLFVVEVVDKYFDCYD 
14821 - ATGGTGGCTGTATTAATGCCAACCAAGTAATCGTTAACAATCTGGATAAATCAGCTGGTT - 14880 
-MVAVLMPTK* SLTIWINQLV 
-WWLY*CQPSNR*QSG*ISWF 
GGCINANQVIVNNLDKSAGF
14881 - TCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATC - 14940
-SHLINGVRLDFIMTQ*VMRI 
-PI**MG*G*TLL*LNEL*GS 
PFNKWGKARLYYDSMSYEDQ 
14941 - AAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC - 15000
-KMHFSRILSVMSSLL*LK*I 
-RCTFRVY*A*CHPYYNSNES 
DALFAYTKRNVI PTITQMNL 
15001 - TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTA - 15060 
-LSMPLVQRIELAP*LVSLSV
-*VCH*CKE*SSHRSWCLYL* 
KYAI SAKNRARTVAGVS I CS 
15061 - GTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAG - 15120 
-VL*QIDSFIRNY* S Q * P P L E 
-YYDK*TVSSEIIEVNSRH*R 
TMTNRQFHQKLLKSIAATRG
15121 - GAGCTACTGTGGTAATTGGAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAA - 15180
-ELLW*LEQASFTVAGIIC*K
-SYCGNWNKQVLRWLA* YVKN 
ATVVIGTSKFYGGWHNMLKT 

15181 - CTGTTTACAGTGATGTAGAAACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACA - 15240

-LFTVM*KLHTLWVGIIQNVT
-CLQ*CRNSTPYGLGLSKM*Q

VYSDVETPHLMGWDYPKCDR
15241 - GAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGTTCTTGCTCGCAAACATAACA - 15300 
-EPCLTCLG*WPLLFLLANIT
-SHA* HA*DNGLSCSCSQT*H 
AMPNMLRIMASLVLARKHNT 
15301 - CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGCAAGTATTAA - 15360 
-LAVTY HTVSTG* LTSVRKY* 
-LL*LITPFLQVS*RVCASIK 
CCNLSHRFYRLANECAQVLS 
15361 - GTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTG - 15420 
-VRWSCVAAHYMLNQVEHHPV 
-*DGHVWRLTIC*TRWNIIR* 
EMVMCGGSLYVKPGGTSSGD 
15421 - ATGCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATG - 15480
-MLQLLMLIVSLT FVKLLQPM 

- CYNCLC**CL*HLSSCYSQC 

ATTAYANSVFNI CQAVTANV 
15481 - TAAATGCACTTCTTTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTAC - 15540 

- * M H FFQLMVIR* LT SMSAIY 
-KCTSFN*W**DS*QVCPQST 

NALLSTDGNKIADKYVRNLQ 
15541 - AACACAGGCTCTATGAGTGTCTCTATAGAAATAGGGATGTTGATCATGAATTCGTGGATG - 15600 
-NTGSMSVSIEIGMLIMNSWM 
-TQAL*VSL*K*GC*S* I R G * 
HRLYECLYRNRDVDHEFVDE 
15601 - AGTTTTACGCTTACCTGCGTAAACATTTCTCCATGATGATTCTTTCTGATGATGCCGTTG - 15660 
-SFTLTCVNISP* * FFLMMPL 
-VLRLPA*TFLHDDSF* * C R C 
FYAYLRKHFSMMILSDDAVV 
15661 - TGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCATTAAGAACTTTAAGG - 15720 

- C A I TVTMRLKV* * LALRTLR 
-VL*Q*LCGSRFSS*H*EL*G 

CYNSNYAAQGLVAS IKNFKA 
15721 - CAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGACTG - 15780 
-QFFI IKIMCSCLRQNVGLRL 

-SSLLSK*CVHV*GKMLD*D*

VLYYQNNVFMSEAKCWTETD 
15781 - ACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAG - 15840
-TLLKDLTNFAHSIQC*LNKE
-PY*RTSRILLTAYNAS*TRR
LTKGPHEFCSQHTMLVKQGD
15841 - ATGATTACGTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG - 15900 
-MITCTCLTQIHQEY*AQAVL 
-*LRVPALPRSIKNIRRRLFC
DYVYLPYPDPSRILGAGCFV
15901 - TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTA - 15960
-SMILSKQMVHL*LKGSCHWL 
-R*YCQNRWYTYD*KVRVTGY 
DDIVKTDGTLMIERFVSLAI
15961 - TTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTATGCTGATGTCTTTCACTTGT - 16020
-LMLTHLQNILIRSMLMSFTC

-*CLPTYKTS*SGVC*CLSLV

DAYPLTKHPNQEYADVFHLY 
16021 - ATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGCCACATGTTGGACATGTATT - 16080 
-IYNTLESYMMSLLATCWTCI 
-FTIH*KVT* *AYWPHVGHVF 
LQYIRKLHDELTGHMLDMYS 
16081 - CCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTA - 16140 
-P*C*LMITPHGTGNLSFMRL

- R N A N * * *HLTVLGT*VL*GY 

VMLTNDNTSRYWEPEFYEAM 
16141 - TGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA - 16200
-CTHH IQSCRL*VLVYCAIHR 
-VHTTYSLAGCRCLCIVQFTD 
YTPHTVLQAVGACVLCNSQT 
16201 - CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATG - 16260 
-LHFVAVPVLGDHSYVASAAM 

-FTSLRCLY*ETIPMLQVLL*

SLRCGACIRRPFLCCKCCYD 
16261 - ACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATG - 16320 
-TMSFQHHTN*CCLLIPMFAM
-PCHFNITQISVVC*SLCLQC 
HVISTSHKLVLSVNPYVCNA 
16321 - CCCCAGGTTGTGATGTCACTGATGTGACACAACTGTATCTAGGAGGTATGAGCTATTATT - 16380 
-PQVVMSLM*HNCI*EV*AII 
-PRL*CH*CDTTVSRRYELLL 
PGCDVTDVTQLYLGGMSYYC 
16381 - GCAAGTCACATAAGCCTCCCATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTT - 16440 
-ASHISLPLVFHYVLMVRFLV 
-QVT*ASH*FSIMC*WSGFWF
KSHKPPISFPLCANGQVFGL 
16441 - TATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT - 16500 
-YTKTHV*AVTMSLTSMR*QH 
-IQKHMCRQ*QCH*LQCDSNM 
YKNTCVGSDNVTDFNAIATC 
16501 - GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAAGC - 16560 
-VI GLMLAI TYLPTLVLRDSS 
-*LD*CWRLHTCQHLY*ETQA 
DWTNAGDYILANTCTERLKL 
16561 - TTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTG - 16620 
-FSQQKRSKPLRKHLSCHMVL 
-FRSRNAQSH*GNI*AVIWYC
FAAETLKATEETFKLSYGIA 
16621 - CCACTGTACGCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAAC - 16680
-PLYAKYSLTENCIFHGRLEN 
-HCTRSTL* QRIASFMGGWKT 
TVREVLSDRELHLSWEVGKP 
16681 - CTAGACCACCATTGAACAGAAACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTA - 16740 

- L D H H * TETMSLLVTV*LKIV 

- * T T I EQKLCLYWLPCN*K* * 

RPPLNRNYVFTGYRVTKNSK 
16741 - AAGTACAGATTGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA - 16800 
-KYRLESTP LKKVTMVMLLCT 
-STDWRVHL*KR*LW*CCCVQ 

VQIGEYTFEKGDYGDAVVYR
16801 - GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACATCTCACACTG - 16860
-EVLRHTS*MLVITLC*HLTL 
-RYYDIQVECW*LLCVDISHC 
GTTTYKLNVGDYFVLTSHTV 

16861 - TAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATTACTGGCT - 16920 

- * CHLVHLL * CHKSTM*ELLA 
-NAT*CTYSSATRALCENYWL 

MPLSAPTLVPQEHYVRITGL 
16921 - TGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGG - 16980
-CTQHSTSQMSFLAMLQIIKR
-VPNTQHLR*VF*QCCKLSKG
YPTLNISDEFSSNVANYQKV 
16981 - TCGGCATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTG - 17040 
-SACKSTLHSKDHLVLVRVIL 
-RRAKVLYTPRTTWYW*ESFC 
GMQKYSTLQGPPGTGKSHFA 
17041 - CCATCGGACTTGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG - 17100 
-PS DLLSI T H L L A * C IRHALM 
-HRTCSLLPICSHSVYGMLSC 
IGLALYYPSARIVYTACSHA
17101 - CAGCTGTTGATGCCCTATGTGAAAAGGCATTAAAATATTTGCCCATAGATAAATGTAGTA - 17160 
-QLLMPYVKRH*NICP*INVV 
-SC*CPM*KGIKIFAHR*M** 
AVDALCEKALKYLP I DKCSR 
17161 - GAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAAATTCAAAGTGAATTCAACAC - 17220 
-ESYLRVRA* SVLINSK* IQH 
-NHTCACARRV F * * IQSEFNT 
IIPARARVECFDKFKVNSTL 
17221 - TAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTGCTGACATTGTAG - 17280 

- * N SMFSAL*MHCQKQLLTL* 
-RTVCFLHCKCIARNNC*HCS 

EQYVFCTVNALPETTADIVV 
17281 - TCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTC - 17340 
-SLMKSLWLLIMT*VLSMLDF
-L**NLYGY *L*LECCQC* TS 
FDEISMATNYDLSVVNARLR 
17341 - GTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC - 17400
-VQNTTSILAILLNYQPPAHC
-CKTLRLYWRSCSITSPPHIA
AKHYVYIGDPAQLPAPRTLL
17401 - TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAA - 17460 
-*LKAH*NQNILIQCADL*KQ 
-D*RHTRTRI F * FSVQTYENN 
TKGTLEPEYFNSVCRLMKTI 
17461 - TAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTG - 17520 

- * VQTCSLELVAVVLLKLLTL 
-RSRHVPWNLSPLSC*NC*HC 

GPDMFLGTCRRCPAEIVDTV 
17521 - TGAGTGCTTTAGTTTATGACAATAAGCTAAAAGCACACAAGGATAAGTCAGCTCAATGCT - 17580
-*VL*FMTIS*KHTRISQLNA 
-ECFSL*Q*AKSTQG*VSSML 
SALVYDNKLKAHKDKSAQCF 
17581 - TCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATCTGCAATCAACAGACCTC - 17640
-SKCSTKVLLHMMFHLQSTDL 
-QNVLQRCYYT*CFICNQQTS 
KMFYKGVITHDVSSAINRPQ
17641 - AAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA - 17700
-K*AL*ENFLHAILLGEKLFL
-NRRCKRISYTQSCLEKSCFY 
IGVVREFLTRNPAWRKAVFI 

17701 - TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGA - 17760 
-SHLIIHRTL*LQKS*DCLRR 

- L T L * FTERCSFKNLRIAYAD 

SPYNSQNAVASKI LGLPTQT 
17761 - CTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAA - 17820 
-LLIHHRVLNMTMSYSHKLLK 
-C*FITGF*I*LCHIHTNY*N 

VDSSQGSEYDYVIFTQTTET 

17821 - CAGCACACTCTTGTAATGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCA - 17880

-QHTLVMSTASMWLSQGQKLA
-STLL*CQPLQCGYHKGKNWH
AHSCNVNRFNVAI TRAKIGI 
17881 - TTTTGTGCATAATGTCTGATAGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAA - 17940 
-FCA*CLIEI FMTNCNLQV* K 

- F V H N V * *RSL*QTAIYKSRN 

LCIMSDRDLYDKLQFTSLEI 
17941 - TACCACGTCGCAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT - 18000 

- Y H VAMWLHYKQKM * LDFLRT 
-TTSQCGYITSRKCNWTF*GL 

PRRNVATLQAENVTGLFKDC 
18001 - GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTCAGCGTTGATA - 18060 
-VVRSLLVFILHRHLHTSALI 
-**DHYWSSSYTGTYTPQR*Y 
SKI ITGLHPTQAPTHLSVDI 

18061 - TAAAATTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT - 18120

- * N SRLKDYVLTYQAYQRT* P 
-KIQD*RIMC*HTRHTKGHDL 

KFKTEGLCVDI PGIPKDMTY 
18121 - ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTA - 18180 
-TVDSSL*WVSK*ITKSMVTL

- P * THLYDGFQNELPSQWLP* 

RRLI SMMGFKMNYQVNGYPN 
18181 - ATATGTTTATCACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATG - 18240 
-I C LS PAKKLFVT FVRGLALM 
-YVYHPRRSYSSRSCVDWL*C 
MFI TREEAIRHVRAWIGFDV 
18241 - TAGAGGGCTGTCATGCAACTAGAGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGAT - 18300 

- * RAVMQLEMLWVLTYLS S * D 
-RGLSCN*RCCGY*PTSPARI 

EGCHATRDAVGTNLPLQLGF 
18301 - TTTCTACAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACTGAAAATAACA - 18360 
-FLQVLT* *LYRLVMLTLKIT 
-FYRC*LSSCTDWLC*H*K*H 
STGVNLVAVPTGYVDTENNT 
18361 - CAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAGTTTAAACATCTTATAC - 18420
-QNSPELMQNLHQVTSLNILY 
-RIHQS*CKTSTR*PV*TSYT 
EFTRVNAKPPPGDQFKHLIP 
18421 - CACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAATGCTCA - 18480 
-HSCIKACPGM*CVLR*YKCS 
-THV*RLALECSAY*DSTNAQ
LMYKGLPWNVVRIKIVQMLS
18481 - GTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTG - 18540
-VIH*KDCQTESCSSFGRMAL
-*YTERIVRQSRVRPLGAWL*
DTLKGLSDRVVFVLWAHGFE 
18541 - AGCTTACATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG - 18600 
-SLHQ*STLSRLDLKERVVCV
-AYINEVLCQDWT*KNVLSV* 
LTSMKYFVKIGPERTCCLCD 
18601 - ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTG - 18660 
-TNVQLAFLLHQILMPAGIIL
-QTCNLLFYFIRYLCLLESFC 
KRATCFSTSSDTYACWNHSV 
18661 - TGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGTTCAGCAGTGGGGCTTTACGG - 18720
-WVLTMSITHL*LMFSSGALR 
-GF*LCL*PIYD*CSAVGLYG 
GFDY VYNPFMI DVQQWGFTG 
18721 - GTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTACATGGAAATGCACATGTGGCTA - 18780 

- V T FRVTMTN I ARYMEM HMWL 
-*PSE*P*PTLPGTWKCTCG* 

NLQS NHDQHCQVHGNAHVAS 
18781 - GTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCTTTGTTAAGCGCGTTG - 18840
-VVMLS*LDV*QSMSALLSAL

-L*CYHD*MFSSP*VLC*AR*
CDAIMTRCLAVHECFVKRVD
18841 - ATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA - 18900
-IGLLNTLL*EMN*GLILLAE
-LVC*IPYYRR*TEG*FCLQK
WSVEYPIIGDELRVNSACRK
18901 - AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATG - 18960
-KYNTWL*SLHCLLISFQFFM
-STTHGCEVCIAC**VSSSS*
VQHMVVKSALLADKFPVLHD
18961 - ACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCT - 19020

- T L E I QRLSSVCLRLK*NGSS 
-HWKSKGYQVCASG* SRMEVL 

IGNP KAIKCVPQAEVEWKFY 
19021 - ACGATGCTCAGCCATGTAGTGACAAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATG - 19080 

- T M L S H VVTKLTK* RNS SILM 

-RCSAM**QSLQNRGTLLFLC

DAQPCSDKAYKIEELFYSYA 
19081 - CTACACATCACGATAAATTCACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATC - 19140 
-LHITINSLMVFVCFGIVTLI 
-YTSR*IH*WCLFVLEL*R*S 
THHDKFTDGVCLFWNCNVDR 
19141 - GTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCTTGTCAAACTTGAACT - 19200 
-VTQPMQLCVGLTQESCQT* T 
-LPSQCNCV*V*HKSLVKLEL 
YPANAIVCRFDTRVLSNLNL 
19201 - TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCAGCTT - 19260
-YQAVMVVVCM*ISMHSTLQL
-TRL*WW*FVCE*ACIPHSSF
PGCDGGSLYVNKHAFHTPAF 
19261 - TCGATAAAAGTGCATTTACTAATTTAAAGCAATTGGCTTTCTTTTACTATTCTGATAGTC - 19320
-SIKVHLLI*SNCLSFTILIV 
-R*KCIY*FKAIAFLLLF**S 
DKSAFTNLKQLPFFYYSDSP
19321 - CTTGTGAGTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTG - 19380
-LVSLMANK* CRILIMFHSNL 
-L*VSWQTSSVGY*LCSTQIC 
CESHGKQVVSDI DYVPLKSA 
19381 - CTACGTGTATTACACGATGCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGT - 19440 
-LRVLHDAI *VVLFADTMQMS 
-YVYYTMQFRWCCLQTPCK*V 
TCITRCNLGGAVCRHHANEY 
19441 - ACCGACAGTACTTGGATGCATATAATATGATGATTTCTGCTGGATTTAGCCTATGGATTT - 19500 
-TDSTWMHII**FLLDLAYGF
-PTVLGCI*YDDFCWI*PMDL 
RQYLDAYNMMISAGFSLWIY 
19501 - ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAGAGTTTAGAAA - 19560
-TNNLI LI TCGIHLPGYRV*K 
-QTI * Y L * PVEYIYQVTEFRK 
KQFDTYNLWNTFTRLQSLEN 
19561 - ATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCGAAGCACCTG - 19620 
-MWLIMLL IKDTLMDTPAKHL 
-CGL*CC**RTL*WTRRRSTC 
VAYNVVNKGHFDGHAGEAPV 
19621 - TTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTG - 19680 

- F P S L IML FTQR*MVLMWRSL 
-FHH**CCLHKGRWY*CGDL* 

SIINNAVYTKVDGI DVEIFE 
19681 - AAAATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTA - 19740
-KIRQHFLLMLHLSFGLSVTL 
-K*DNTSC*CCI*ALG*A*H* 
NKTTLPVNVAFELWAKRNIK 
19741 - AACCAGTGCCAGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG - 19800 
-NQCQRLRYS IIWVL I SLLIL 
-TSARD*DTQ*FGC*YRC*YC
PVPE IKI LNNLGVDIAANTV 
19801 - TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGTGTCTGCACAA - 19860

- * SGTTKEKPQHMYLQ* VSAQ 
-NLGLQKRSPSTCIYNRCLHN 

I W DYKREAPAHVST IGVCTM 
19861 - TGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTCTTCACTTACTGTCTTGTTTG - 19920 
-*LTLPRNLLRVLVLHLLSCL
-D*HCQETY*ECLFFTYCLV* 
TDIAKKPTESACSSLTVLFD 
19921 - ATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAATGGTGTTTTAATAA - 19980 
-MVEWKDR* TFLETPVMVF* * 
-W*SGRTGRPF*KRP*WCFNN 
GRVEGQVDLF RNARNGVLIT 
19981 - CAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATG - 20040
-QKVQ S KV * HLQRDQHKLASM 
-RRFSQRSNTFKGTSTS * R Q W 
EGSVKGLTPSKGPAQASVNG 
20041 - GAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG - 20100 
-ESH*LENQ*KHSLTTLRK*T 

- SHINWRISKNTV*LL*ESRR 

VTLIGESVKTQFNYFKKVDG 
20101 - GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTA - 20160 
-ALFNSCLKPTLLRAET * R I L 
-HYSTVA*NLLYSEQRLRGF*

IIQQLPETYFTQSRDLEDFK
20161 - AGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGC - 20220
-SPDHKWKLTFSS SLWMNSYS 

- AQITNGN*LSRARYG* IHTA 

PRSQMETDFLELAMDEFIQR 
20221 - GATATAAGCTCGAGGGCTATGCCTTCGAACACATCGTTTATGGAGATTTCAGTCATGGAC - 20280 
-DISSRAMPSNTS FMEISVMD 
-I*ARGLCLRTHRLWRFQSWT
YKLEGYAFEHIVYGDFSHGQ 
20281 - AACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTA - 20340 
-NLAVFI* * *A*PSAHKIHHL 
-TWRSSFNDRLSQALTRFTT* 
LGGLHLMIGLAKRSQDSPLK 
20341 - AATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC - 20400 
-N*RILSLWTAQ*KITS*QMR 

- IRGFYPYGQHSEKLLHNRCA 

LEDFIPMDST VKNYFITDAQ 
20401 - AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCG - 20460
-KQVHQNVCVL*LIFYLMTLS
-NRFIKMCVFCD*SFT**LCR
TGSSKCVCSVI DLLLDDFVE 
20461 - AGATAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACT - 20520 
-R**SHKICQ*FQKWSRLQLT
-DNKVTRFVSDFKSGQGYN*L 
IIKSQDLSVISKVVKVTIDY 
20521 - ATGCTGAAATTTCATTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAA - 20580 
-MLKFHSCFGVRMDMLKP STQ 
-C*NFIHALV*GWTC*NLLFK 
AEISFMLWCKDGHVETFYPK 
20581 - AACTACAAGCAAGTCAAGCGTGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGC - 20640 
-NYKQVKRGNQVLRCLTCTRC 
-TTSKSSVATRCCDA*LVQDA 
LQASQAWQPGVAMPNLYKMQ 
20641 - AAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA - 20700 
-KECFLKSV TFRIMVKMLLYQ 
-KNAS*KV* PSELW*KCCYTK 
RMLLEKCDLQNYGENAVIPK
20701 - AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTAAATACACTTA - 20760

- K E * * *MSQSILNCVNT* IHL 
-RNNDECRKVYSTVSILKYTY 

GIMMNVAKYTQLCQYLNTLT 
20761 - CTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGGAG - 20820 
-L*LYPTT*ELFTLVLALIKE
-FSCTLQHESYSLWCWL**RS 
LAVPYNMRVIHFGAGSDKGV 
20821 - TTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATT - 20880
-LHQVQLCSDNGCQLAHYLSI
-CTRYSCAQTMVANWHTTCRF
APGTAVLRQWLPTGTLLVDS
20881 - CAGATCTTAATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAG - 20940
-QILMTSSPTQILL*LETVQQ

-RS**LRLRRRFYFNWRLCNS

DLNDFVSDADSTLIGDCATV
20941 - TACATACGGCTAATAAATGGGACCTTATTATTAGCGATATGTATGACCCTAGGACCAAAC - 21000 
-YIRLINGTLLLAICMTLGPN 

- T Y G * *MGPYY*RYV*P*DQT 

HTANKWDLIISDMYDPRTKH
21001 - ATGTGACAAAAGAGAATGACTCTAAAGAAGGGTTTTTCACTTATCTGTGTGGATTTATAA - 21060

- M * QKRMTLKKGFSLICVDL* 
-CDKRE*L*RRVFHLSVWIYK 

VTKENDSKEGFFTYLCGFIK 
21061 - AGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAACAGAGCATTCTTGGAATG - 21120 
-SKN*PWVVL*L*R*QSILGM 

- AKTS PGWFYSCKDNRAFLEC 

QKLALGGS I A V K I TEHSWNA 
21121 - CTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACAAATGTAA - 21180 
-LTFTSLWAISHGGQLLLQM* 
-*PLQAYGPFLMVDSFCYKCK
DLYKLMGHFSWWTAFVTNVN 
21181 - ATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAAC - 21240
-MHHHRKHF*LGLTILASRRN
-CIIIGSIFNWG*LSWQAEGT
ASSSEAFLIGANYLGKPKEQ 
21241 - AAATTGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC - 21300 

- K L M A I PCMLTTFSGGTQILS 

- N * WLYHAC* LHFLEEHKSYP 

IDGYTMHANYIFWRNTNPIQ
21301 - AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTG - 21360 
-SCLPIHSLT*ANFLLN*EEL 
-VVFLFTL*HEQISS* IKRNC 
LSSYSLFDMSKFPLKLRGTA 
21361 - CTGTAATGTCTCTTAAGGAGAATCAAATCAATGATATGATTTATTCTCTTCTGGAAAAAG - 21420 

- L * CLLRRI KSMI * FILFWKK 

- C N V S * G E S N Q * YDLFSSGKR 

VMSLKENQINDMI YSLLEKG 
21421 - GTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGTTTCAAGTGATATTCTTGTTAACA - 21480 
-VGLSLEKTTELWFQVIFLLT 
-*AYH*RKQQSCGFK*YSC*Q 
RLIIRENNRVVVSSDILVNN 
21481 - ACTAAACGAACATGTTTATTTTGTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTG - 21540 
-TKRTCLFS YYFLLS LVVVTL 
-LNEHVYFLIISYSH*W**P* 
* TNMFIFLLFLTLTSGSDLD 
21541 - ACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA - 21600 
-TGAPLLMMFKLL I TLN I LHL 
-PVHHF**CSSS*LHSTYFIY 
RCTTFDDVQAPNYTQHTSSM 
21601 - TGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGG - 21660 
-*GGFTILMKFLDQTLFI*LR 
-EGGLLS**NF*IRHSLFNSG
RGVYYPDEIFRSDTLYLTQD 
21661 - ATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTG - 21720 
-IYFFHFILMLQGFI LLI IRL 

- F I S S ILF*CYRVSYY*SYVW 

LFLPFYSNVTGFHTINHTFG 
21721 - GCAACCCTGTCATACCTTTTAAGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATG - 21780 
-AT LSYLLRMVFI LLPQRNQM 

-QPCHTF*GWYLFCCHREIKC

NPVI PFKDGI YFAATEKSNV 
21781 - TTGTCCGTGGTTGGGTTTTTGGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTA - 21840
-LSVVGFLVLP*TTSHSR*LL
-CPWLGFWFYHEQQVTVGDYY
VRGWVFGSTMNNKSQSVIII
21841 - TTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT - 21900
-LTI LLMLLYEHVTLNCVTTL 
-*QFY*CCYTSM*L*IV*QPF 
NNSTNVVIRACNFELCDNPF 

21901 - TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATAATGCAT - 21960
-SLLFLNPWVHRHIL*YSIMH
-LCCF*THGYTDTYYDIR*CI
FAVSKPMGTQTHTMIFDNAF

21961 - TTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAG - 22020
-LIALS STYLMPFRLMFQKSQ 

- *LHFRVHI*CLFA*CFRKVR 

NCTFEYISDAFSLDVSEKSG 
22021 - GTAATTTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTT - 22080 
-VILNT YESLCLKIKMGFSMF 
-*F*TLTRVCV*K*RWVSLCL 
NFKHLREFVFKNKDGFLYVY 
22081 - ATAAGGGCTATCAACCTATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGA - 22140
-IRAINL*M*FVIYLLVLTL*
-*GLSTYRCSS*STFWF*HFE
KGYQPIDVVRDLPSGFNTLK 
22141 - AACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG - 22200
-NLFLS CLLVLTLQI LEPFLQ 
-TYF*VASWY*HYKF*SHSYS 
PIFKLPLGINITNFRAILTA 
22201 - CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTTGTTGGCTATT - 22260
-PFHLLKTFGARQLQ PILLAI 
-LFTCSRHLGHVSCSLFCWLF 
FSPAQDIWGTSAAAYFVGYL 
22261 - TAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACAATCACAGATGCTGTTG - 22320 

- * SQLHLCSSMMKMVQSQMLL 

- K A N Y I Y A Q V * *KWYNHRCC* 

KPTTFMLKYDENGTITDAVD 
22321 - ATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA - 22380

- I V L K I HLLNSNALLRALRLT 
-LFSKSTC*TQMLC*EL*D*Q 

CSQN PLAELKCSVKSFEIDK 
22381 - AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCC - 22440 
-KEFTRPLISGLFPQEML*DS 
-RNLP D L * FQGCSLRRCCEIP 
GIYQTSNFRVVPSGDVVRFP 
22441 - CTAATATTACAAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG - 22500 
-LI LQTCVLLERFLMLLNSLL 
-*YYKLVSFWRGF*CY*IPFC 
NITNLCPFGEVFNATKFPSV 
22501 - TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACA - 22560 
-SMHGREKKFLIVLLITLCST 
-LCMGEKKNF*LCC*LLCALQ 
YAWERKKISNCVADYSVLYN 
22561 - ACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTCTGCCACTAAGTTGAATGATC - 22620 
-TQHFFQPLSAMAFL PLS * M I 
-LNIFFNL*VLWRFCH*VE*S 
STFFSTFKCYGVSATKLNDL 
22621 - TTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGAGATGATGTAAGACAAA - 22680 
-FASPMSMQILL* SREMM*DK 
~ L LLQCLCRFFCSQGR* CKTN 
CFSNVYADSFVVKGDDVRQI 
22681 - TAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCA - 22740 
-*RQDKLVLLLII IINCQMIS 
-SARTNWCYC*L*L*IAR*FH 
APGQTGVIADYNYKLPDDFM 
22741 - TGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA - 22800 
-WVVSLLGILGTLMLLQLVI I 
-GLCPCLEY*EH*CYFNW*I ) * 
GCVLAWNTRNIDATSTGNYN 
22801 - ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTA - 22860 
-IINIGILDMASLGPLRETYL 
-L*I*VS*TWQA*AL*ERHI* 
YKYRYLRHGKLRPFERDISN 
22861 - ATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGC - 22920 
-MCLSPLMANLAPHLLLIV IG 
-CAFLP*WQTLHPTCS*LLLA 
VPFSPDGKPCTPPALNCYWP 
22921 - CATTAAATGATTATGGTTTTTACACCACTACTGGCATTGGCTACCAACCTTACAGAGTTG - 22980 

- H * MIMVFTPLLALATNLTEL 

- IK^LWFLHHYWHWLPTLQSC 

LNDYGFYTTTGIGYQPYRVV 
22981 - TAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCA - 23040 

- * YFLLNF*MHRPRFVDQNYP 
-STFF*TFKCTGHGLWTKI XH 

VLSFELLNAPATVCGPKLST 
23041 - CTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG - 23100 
-LTLLRTSVS ILI LMDSLVLV 
-*PY*EPVCQF*F*WTHWYWC 
DLIKNQCVNFNFNGLTGTGV 
23101 - TGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTG - 23160 

- C * LLLQRDFNHFNNLAVMFL 

- VNSFFKEISTISTIWP*CF* 

LTPS SKRFQPFQQFGRDVSD 
23161 - ATTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCT - 23220 

- I S L I P F E ILKHLKY * TFHLA 

- F H * FRSRS*NI*NIRHFTLL 

FTDSVRDPKTSEILDISPCS 
23221 - CTTTTGGGGGTGTAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTC - 23280 
-LLGV*V* LHLEQMLHLKLLF 
-FWGCKCNYTWNKCFI*SCCS 
FGGV SVITPGTNAS SEVAVL 
23281 - TATATCAAGATGTTAACTGCACTGATGTTTGTACAGCAATTCATGCAGATCAACTCACAC - 23340 
-Y I KMLTALMFLQQFMQ INSH 
-ISRC*LH*CFYSNSCRSTHT 
YQDVNCTDVSTAIHADQLTP 
23341 - CAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA - 23400 
-QLGAYILLETMYSRLKQAVL 
-SLAHIFYWKQCIPDSSRLSY 
AWRIYSTGNNVFQTQAGCLI 
23401 - TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGAGCTGGCATTT - 23460 
-*ELSMSTLLMSATFLLELAF 

- RS*ACRHFL*VRHSYWSWHL 

GAEHVDTSYECDIP IGAGIC 
23461 - GTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTGGCTT - 23520 
-VLVT I Q FLYYVVLAKNLLWL 
_e*LPYSFFIT*Y*PKIYCGL 

ASYH TVSLLRSTSQKS IVAY 
23521 - ATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATAC - 23580 
-I L C L * V L I VQLIiTLITPLLY 

- YYVFRC** F N C L L * * H H C Y T 

TMSLGAD S SIAYSNNTIAIP 
23581 - CTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCT - 23640 
-LLTFQLALLQK* CLFLWLKP 

- Y*LFN*HYYRSNACFYG*NL 

TNFSI SITTEVMPVSMAKTS 
23641 - CCGTAGATTGTAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC - 23700 
-P*IVICTSAEILLNVLICFS 
-RRL*YVHLRRFY*MC* FASP 
VDCNMYICGDSTECANLLLQ 
23701 - AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCTGCTGAACAGG - 23760 
-NMVAFAHN* IVHSQVLLLNR 
IW*LLHTTKSCTLRYCC* TG 
YGS FCTQLNRALSGIAAEQD 
23761 - ATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATGTACAAAACCCCAACTTTGA - 23820 
-IATHVKCSLKSNKCTKPQL* 
-SQHT*SVRSSQTNVQNPNFE 
RNTREVFAQVKQMYKTPTLK 
23821 - AATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGCCAACTAAGA - 23880 

- N ILVVLI FHKYYLTL* S Q L R 
-IFWWF*FFTNIT*PSKAN*E 

YFGGFNFSQILPDPLKPTKR 
23881 - GGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGA - 23940 
-GLLLRTCSLI R * HS LMLAS * 
-VFY*GLAL**GDTR*CWLHE 

SFIEDLLFNKVTLADAGFMK 

23941 - AGCAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT - 24000 

-SNMANA^VILMLEISFVRRS 
-AIWRMPR*Y*C*RSHLCAEV 
QYGECLGDINARDLICAQKF 

24001 - TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTG - 24060 

-SMDLQCCHLCSLMI * L L P T L 
-QWTYSVATSAH* *YDCCLHC 
NGLTVLPPLLTDDMIAAYTA 
24061 - CTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTC - 24120 

- L L * LVVLPLLDGHLVLALLF 
-CSS*WYCHCWMDIWCWRCSS 

ALVSGTATAGWTFGAGAALQ 
24121 - AAATACCTTTTGCTATGCAAATGGCATATAGGTTCAATGGCATTGGAGTTACCCAAAATG - 24180 
-KYLLLCKW H I GSMALELPKM 
-NTFCYANG I *VQWHWSYPKC 
IPFAMQMAYR FNGI GVTQNV 
24181 - TTCTGTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTC - 24240 
-FSMRTKNKSPTNLTRRLVKF 

- S L * . E PKTN RQP I * QGD* SNS 

LYENQKQIANQFNKAISQIQ 
24241 - AAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACCAGA - 24300 
-KNHLQQHQLHWASCKTliLTR 

- RITYNNINCIGQAARRC* PE 

ESLTTTSTALGKLQDVVNQN 
24301 - ATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAATTTCAA - 24360 
-MLKH* THLLNNLALILVQFQ 
-CSSIKHTC*TT*L*FWCNFK 

AQALNTLVKQLSSNFGAISS 
24361 - GTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACA - 24420 
-VC*MISFRDLIKSRRRYKLT 
-CAK*YPFAT**SRGGGTN*Q 
VLNDILSRLDKVEAEVQIDR 

24421 - GGTTAATTACAGGCAGACTTCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGG - 24480 
-G*LQADFKAFKPM*HNN*SG 
-VNYRQTSKPSNLCNTTTNQG 
LITGRLQSLQTYVTQQLIRA 

24481 - CTGCTGAAATCAGGGCTTCTGCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTG - 24540 
-LLKSGLLLILLLLKCLSVFL 

- C * N Q G F C * SCCY*NV*VCSW 

AEIRASANLAATKMSECVLG 
24541 - GACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCACCTTATGTCCTTCCCACAAG - 24600 
-DNQKELTFVERATTLCPSHK 
-TIKKS*IjLWKGLPPYVLPTS 
QSKRVDFCGKGYHLMSFPQA 
24601 - CAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGAGAGGAACT - 24660 
-QPRMVLSSYMSRMCHPRRGT 
-SPAWCCLPTCHVCAIPGEEL 
APHGVVFLHVTYVPSQERNF 
24661 - TCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTT - 24720 
-SPQRQQFVMKAKHTSLVKVF 
-HHSASNLS*RQSIIiPS*RCF 
TTAPAICHEGKAYFPREGVF 
24721 - TTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAA - 24780 
-LCLMALLGLLHRGTSFLHK* 
-CV*WHFLVYYTEELLFSTNN 
VFNGTSWFITQRNFFSPQII 
24781 - TTACTACAGACAATACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACA - 24840 
-LLQTI HLSQE IVMSLLASLT 
-YYRQYXCLRKL*CRYWHH*Q 
TTDNTFVSGNCDVVIGI INN 
24841 - ACACAGTTTATGATCCTCTGCAACCTGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGT - 24900 
-TQFMI LCNLSLTHSKKSWTS 
-HSL*SSAT*A*LIQRRAGQV 
TVYDPLQPELDS FKEELDKY 
24901 - ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGCATTAACGCTT - 24960 

- T S K I I HHQMLILAT FQALTL 
-LQKSYITRC*SWRHFRH*RF 

FKNHTSPDVDLGDISGINAS 
24961 - CTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCTAAAAATTTAAATG - 25020 
-LS STFKKKIiTASMRSLKI * M 
-CRQHSKRN* PPQ*GR*KFK* 
VVNIQKEIDRLNEVAKNLNE 
25021 - AATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTGGT - 25080 
-NHS LTFKNWENMSN ILNGLG 
-ITH*PSRIGKI*AIY*MALV 
SLIDLQELGKYEQYIKWPWY 
25081 - ATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTT - 25140 
-MFGSASLLD* LPSSWLQSCF 
-CLARLHCWTNCHRHGYNLAL 
VWLGFIAGLIAIVMVTILLC 
25141 - GTTGCATGACTAGTTGTTGCAGTTGCCTGAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA - 25200 

- V A * LVVAVASRVHALVVLAA 

- L H D * LLQLPQGCMLLWFLLQ 

CMTSCCSCLKGACSCGSCCK 
25201 - AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAA - 25260 
-SLMRMTLSQFSRVSNYITHK 
~V**G*L*ASSQGCQITLHIN 
FDEDDSEPVLKGVKLHYT*T 
25261 - CGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTGGATCAATTACTGCACAGCCAGT - 25320 
-RTYGFVYEIFYSWINYCTAS 
-ELMDLFMRFFTLGS ITAQPV 
NLW ICL*DFLLLDQLLHSQ* 
25321 - AAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTACAGCAACGATACCGCTACA - 25380 

- K N * QCFSCKYCSCYSNDTAT 
-KIDNASPASTVHATATIPLQ 

KLTMLLLQVLFMLQQRYRYK 
25381 - AGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAG - 25440 
-SLTPFRMACYWRCISCCFSE 
-ASLP FGWLVIGVAFLAVFQS 
PHSLSDGLLLALHFLLFFRA 
25441 - CGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA - 25500 
-RYQNNCAQ*KMAASPL*GLP 

- ATKI IALNKRWQLALYKGFQ 

LPK*LRSIKDGS*PFIRASS 
25501 - GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGC - 25560 
-VHLQ FTAAICYHLFT SFACR 

- FI CN LLLLFVT IYSHLLLVA 

SFAIYCCYLLPSIHIFCLSL 
25561 - TGCAGGTAAGGAGGCGCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCAT - 25620 
-CR*GGAIFVPLCLD.IFSTMH 
-AGKEAQFLYLYALIYFLQCI 
QVRRRNFCTSMP* YI FYNAS 
25621 - CAACGCATGTAGAATTATTATGAGATGTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCC - 25680 
-QRM*NYYEMLALLEVQIQEP 
-NACRIIMRCWLCWKCKSKNP 
THVELL* DVGFVGSANPRTH 
25681 - ATTACTTTATGATGCCAACTACTTTGTTTGCTGGCACACACATAACTATGACTACTGTAT - 25740 

- I T L * CQLLCLLAHT * L * LLY 
-LLY DANYFVCWHTHNY DYCI 

YFMMPTTLFAGTHITMTTVY 
25741 - ACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC - 25800 
-TI * QCHRYNCRY*R* RHFNT 
-PYNSVTDTIVVTEGDGISTP 
HITVSQIQLSLLKVTAFQHQ 
25801 - AAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTCAGGTGTTAA - 25860 
-KTQRRLPNWWLF*G*ALRC* 
-KLKE DYQIGGYSE DRHSGVK 
NSKKTTKL VVILRIGTQVLK 
25861 - AGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACA - 25920 
-RLCRCTWLFHRSLLPA*VYT 

- DYVVVHGYFTEVYYQLESTQ 

TMSLYMAISPKFTTSLSLHK 
25921 - AATTACTACAGACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAA - 25980 
-NYYRHWY*KCYI L H L * Q A C * 

- ITTDTGIENATFFIFNKLVK 

LLQTLVLKMLHS SSLTSLLK 
25981 - AGACCCACCGAATGTGCAAATACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGC - 26040 
-RPTECANTHNRRLFRSC* SS 
-DPPNVQIHTIDGS SGVANPA 

THRMCKYTQSTALQELLIQQ 
26041 - AATGGATCCAATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA - 26100 
-NGSNL* *ADDDY*RAFVSTR 
-MDPIYDEPTTTTSVPL*AQE 
WIQFMMSRRRLLACLCKHKK 

26101 - AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGTTAATAGTTAA - 26160 
-K*VRTYVLIRFGRNRYVNS* 

- SEYELMYSFVSBBTGTLIVN 

VSTNLCTHSFRKKQVR* * L I 
26161 - TAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCATCCTTAC - 26220 

- * RTS FSCFRGILAS HTSHPY 
-SVLLFLAFVVFLLVTLAILT 

AYFFFLLSWYSC*SH*PSIiL 
26221 - TGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAAC - 26280 
-CAS IVCVLLQYC^REFS KTN 
-ALRLCAYCCNIVNVSLVKPT 
RFDCVRTAAILLT*V**NQR 
26281 - GGTTTACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCT - 26340 
-GLRLLAC*-KSELF* R S S * SS 
-VYVYSRVKNLNSSBGVPDLL 
FTSTRVLKI ^TLLKEFLIFW 
26341 - GGTCTAAACGAACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGGTTATCATG - 26400 
-GLNELTI I I ILFGT LTLLIM 
-V*TN*LLLLFCLEL*HCLSW 
SKRTNYYYYSVWNFNIAYHG 
26401 - GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACAATGGAACCTA - 26460 
-ADNGT ITVEELKQLLEQWNL 
-QTTVLLPLRSLNNSWNNGT* 
RQRYYYR*GA*TT PGTMEPS 
26461 - GTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTACAATTTGCCTATTCTAATCGG - 26520 
-VIGFLFLAWIMLLQFAYSNR 
-**VSYS*PGLCYYNLPILIG 
NRFPIPSLDYVTTICXiF*SE 
26521 - AACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGCCAGTAACACTT - 26580 

- N R F L Y IIKLVFLWLLWPVTL 

- T G F C T * *SLFSSGSCGQ*HL 

QVFVHNKACFPLALVASNTC 
26581 - GCTTGTTTTGTGCTTGCTGTTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT - 26640 
-ACFVLAVVYRINWVTGGIAI 
-LVLCLLLSTELIG*LAGLRL 
LFCACCCLQN* LGDWRDCDC 
26641 - GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG - 26700 
-AMACI VGLMWLSYFVASFRL 
-QWLVL*A*CGLATSLLPSGC 
NGLYCRLDVA*LLRCFLQAV 
26701 - TTTGCTCGTACCCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTG - 26760 
-FARTRSMWSFWPETNILLNV 
-LLVPAQCGHSTQKQTFFSMC 
CSYPLNVVIQPRNKHSSQCA 
26761 - CCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCT - 26820 
-PLRGT IVTRPLMESELVIGA 
-LSGGQL* PDRS-WKVNLSLVL 
SPGDNCDQTAHGK*TCHWCC 
26821 - GTGATCATTCGTGGTCACTTGCGAATGGCCGGACACTCCCTAGGGCGCTGTGACATTAAG - 26880 
-V I I RGHLRMAGHSLGRC DIK 
* S FVVTCEWPDTP* GAVTLR 
DHSWSLANGRTLPRAL*H*G 
26881 - GACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTCTTATTACAAATTAGGA - 26940 
-DLPKE I TVATSRTLSYYKLG 
-TCQKRSLWLHHERFLITW*E 
PAKRDHCGYITNAFLLQIRS 
26941 - GCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTATTGGA - 27000 
-ASQRVGTDSGFAAYNRYRI G 
-RRSV*ALIQVLLHTTATVLE 
V A A C R H * FRFCCIQPLPYWK 
27001 - AACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAG - 27060 
-NYKLNTDHAGSNDN IALLVQ 
-TIN* IQTTPVATTILLC*YS 
L * IKYRPRR*QRQYCFASTV 
27061 - TAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGAT - 27120 
-*VTTDVSSC*LPGYNSRDI D 
-K*QQMFHLVDFQVTIAEILI 
SDNRCFILLTSRLQ*QRY*L 
27121 - TATCATTATGAGGACTTTCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAAT - 27180 

- Y H Y E DFQDCYLES * RYNKFN 
-IIMRTFRIAIWNLDVIISSI 

SL*GLSGLLFGILTL* * V Q * 
27181 - AGTGAGACAATTATTTAAGCCTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGA - 27240 
-SETII*ASN*EELFGVR**R 
-VRQLFKPLTKKNYSELDDEE 
*DNYLSL*LRRIIRS*MMKN 
27241 - ACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA - 27300 
-TYGVRLS I KRT*KLFSS * H * 
-PMELDYP*NEHENYSLPDID 
LWS* I IHKTNMKI ILFLTLI 
27301 - TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGTACGACTGTAC - 27360 
-LYLHLASY ITIRSVLEVRLY 
-CIYILRAISLSGVC*RYDCT 
VFTSCELYHYQECVRGTTVL 
27361 - TACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTG - 27420 
-Y*KNLAHQEHTRAI HHFTLL 
-TKRTLPIRNIRGQFTISPSC 
LKEPCPSGXYEGNSPFHPLA 
27421 - CTGACAATAAATTTGCACTAACTTGGACTAGCACACACTTTGCTTTTGCTTGTGCTGACG - 27480 
-LTINLH*LALAHTLLLLVLT 
-*Q*ICTNLH*HTLCFCLC*R 
DNKFALTCTSTHFAFACADG 
27481 - GTACTCGACATACCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGAC - 27540 

- V L D I P I SCVQDQFHQNFSS D 
-YSTYLSAACKISFTKTFHQT 

TRHTYQLRARSVSPKLFIRQ 
27541 - AAGAGGAGGTTCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT - 27600 
-KRRFNKS S TRHFFSLLLL* Y 
-RGGSTRALLATFSHCCCSSI 
EEVQQELYSPLFLIVAALVF 
27601 - TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCACTTTAATTGA - 27660 
-F*YFASPLRERQNE * A H F N * 
-FNTLLHH*EKDRMNELTLID 
LILCFTIKRKTE*MSSL*LT 
27661 - CTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAATAATGCTTATTATATT - 27720 
-LLFVLFSLSAI PCFNNAYY I 

- FYLCFLAFLLFLVLIMLI IF 

SICAF*PFCYSLF**CLLYF 
27721 - TTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGAACAT - 27780 
-LVFTRNPGSRRTLYQSLNEH 
-WFSLEIQDLEEPCTKV*TNM 
GFHSKSRI*KNLVPKSKRT* 
27781 - GAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACA - 27840 
-ETSHCFDLYFSMQLHMHCST 
-KLLIVLTCISLCSCICTVVQ 
NFSLF*LVFLYAVAYAL*YS 
27841 - GCGCTGTGCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG - 27900 

- A L C I * *TSCA*RSL*GTTLG 
-RCASNKPHVLEDPCKVQH*G 

AVHLINLMCLKILVRYNTRG 

27901 - GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGAT - 27960 

-VI LIALLGFVL* ERFYLFI D 

- * Y L * H.CI)ALCSRKGFTFS*M 

NTYSTAWLCALGKVLPFHRW 
27961 - GGCACACTATGGTTCAAACATGCACACCTAATGTTACTATCAACTGTCAAGATCCAGCTG - 28020 
-GTLWFKHAHLMLLS TVKIQL 
-AHYGSNMHT*CYYQLSRSSW 
HTMVQTCTPNVTINCQDPAG 
28021 - GTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGGTCACCAAACTGCTGCATTTA - 28080 
-VVRL* LGVGT FMKV TKLLHL 
-WCAYS*VLVPS*RSPNCCI* 
GALIARCWYLHEGHQTAAFR 

28081 - GAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAA - 28140 

-ETYLLF* I NEQI KMSD"NGPQ 
-RRTCCFK* TNKLKCLIMDPN 
DVLVVLNKRTN*NV**WTPI 
28141 - TCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT - 28200 
-SNQRSAPRIT FGGPTDSTDN 
-QTNVVPPALHLVDPQIQLTI 
KPT*CPPHYIWWTHRFN*Q* 
28201 - AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCC - 28260 
-NQNGGRNGARPKQRRPQGLP 
-TRMEDAMGQGQNSADPKVYP 
PEWRTQWGKAKTAPTPRFTQ 
28261 - AATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTC - 28320 
-NNTASWFT ALTQHG KEELRF 
I ILRLGSQLSLSMARRNLDS 
*YCVLVHSSHSAWQGGT*IP 
28321 - CCTCGAGGCCAGGGCGTTCCAATCAACACCAATAGTGGTCCAGATGACCAAATTGGCTAC - 28380 
-PRGQGVPINTNSGP DDQIGY 
-LEARAFQS TPIVVQMTKLAT 
SRPGRSNQHQ*WSR*PNWLL 
28381 - TACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCC - 28440 
-YRRATRRVRGGDGKMKELSP 
-TEELPDEFVVVTAK*KSSAP 
PKSYPTSSWW*RQNERAQPQ 
28441 - AGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTCCCTACGGCGCTAAC - 28500 
-RWYFYYLGTGPEAS LPYGAN 
-DGTSIT*ELAQKLHFPTALT 
MVLLLPRNWPRSFTSLRR*Q 
28501 - AAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCACATT - 28560 
-KEGIVWVATEGALNTPKDHI 
-KKASYGLQLREP* IHPKTTL 
RRHRMGCN * GSLEYTQR PHW 
28561 - GGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACA - 28620 
-GTRNPNNNAATVLQLPQGTT 
-APAILITMLPPCYNFLKEQH 
HPQS**QCCHRATTSSRNNI 

28621 - TTGCCAAAAGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCC - 28680 
-LPKGFYAEGSRGGSQASSRS 
-CQKASTQREAEAAVKPLLAP 
AKRLLRRGKQRRQSSLFSLL 

28681 - TCATCACGTAGTCGCGGTAATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCT - 28740 
-SSRSRGNSRNSTPGSSRGNS 

- HHVVAVIQEIQLLAAVGEIL 

IT* SR*FKKFNSWQQ*GKFS 
28741 - CCTGCTCGAATGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA - 28800 
-PARMA SGGGETALALLLLDR 
LLEWLAEVVKLPSRYCC* TD 
CSNG*RRW*NCPRAIAARQI 
28801 - TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGGCCAAACTGTC - 28860 

- L N Q L E SKVSGKGQQQQGQTV 

- * TSLRAKFLVKANNNKAKLS 

EPA*EQSFW*RPTTTRPNCH 
28861 - ACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTGCCACAAAA - 28920 
-TKKSAAEASKKPRQKRTATK 

- LRNLLLRHLKSIiAKNVLPQN 

*EICC*GI*KASPKTYCHKT 
28921 - CAGTACAACGTCACTCAAGCATTTGGGAGACGTGGTCCAGAACAAACCCAAGGAAATTTC - 28980 
-QYNVTQAFGRRGPEQTQGNF 
STTSLKHLGDVVQNKPKEIS 
VQRHSSIWETWSRTNPRKFR 
28981 - GGGGACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAA - 29040 
-GDQDL IRQGTDYKHWPQIAQ 

- G T K T * SDKELI TNI GRKLHN 

GPRPNQTRN*LQTLAANCTI 
29041 - TTTGCTCCAAGTGCCTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCT - 29100 

- F A P SASAFFGMSRI GMEVTP 

LLQVPLHSLECHALAWKSHL 
CSKCLCILWNVTHWHGSHTF 
29101 - TCGGGAACATGGCTGACTTATCATGGAGCCATTAAATTGGATGACAAAGATCCACAATTC - 29160 
-SGTWLTYHGAIKLDDKDPQF 

- R E H G * LIMEPLNWMTKIHNS 

GNMADLSWSK* IG*QRSTIQ 
29161 - AAAGACAACGTCATACTGCTGAACAAGCACATTGACGCATACAAAACATTCCCACCAACA - 29220 
-KDNV I LLNKHI DAY KTFP PT 
-KTTSYC*TSTLTHTKHSHQQ 
RQRHTAEQAK*RIQNI PTNR 
29221 - GAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTTGCCGCAGAGACAA - 29280 
-E PKKDKKKKTDEAQPLPQRQ 
-SLKRTKRKRLMKLSLCRRDK 
A*KGQKEKD* * SSAFAAETK 
29281 - AAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAA - 29340 
-KKQPTVTLLPAADMDDFSRQ 
-RSSPIj*l»FFLRLTWMISPDN 
EAAHCDSSSCG*HG* FLQTT 
29341 - CTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG - 29400 
-LQNSMSGASADSTQA* TLMM 
-FKIP*VELLLIQLRHKHS** 
SKFHEWSFC*FNSGINTHDD 
29401 - ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTC - 29460 
-TTQGRWAM*TFSQFRLRY IV 
-PHKADGLCKRFRNSVYDT*S 
HTRQMGYVNVFAI PFTIHSL 

29461 - TACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTA - 29520 

- Y S C A E * ILVTKQHK*V*LTL 

- TLVQNEFS*LNSTSRFS*L* 

LLCRMNSRN* TAQVGLVNFN 
29521 - ATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTGAAAGAGCCACCA - 29580 
-ISHSNL* SMCNIREDLKEPP 
-SHIAIFNQCVTLGRT* KSHH 
LT*QSLINV*H*GGLERATT 
29581 - CATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAG - 29640 
-HFHRGHAEYDRGYSE*C*GE 
-I FIEATRSTIEGTVNNARES 
FS SRPRGVRSRVQ* IMLGRA 
29641 - CTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG - 29700 
-LPIWKSPNV*N*F* * C Y P H V 

- CLYGRALMCKINFSSAIPM* 

AYMEEP^CVKLILVVLSPCD 
29701 - ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAA - 29742 
-ILIAS * ENDKKKKKX 

- F * *LLRRMTKKKKX 

FNSFLGE*QKKKKX 
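
The layout used throughout this listing (a 60-nucleotide row followed by three lines of single-letter amino acids, one per reading frame, with `*` marking stop codons) can be reproduced with a short script. The following is a minimal sketch, not part of the original application: the function names `translate` and `reverse_complement` are illustrative, the standard genetic code is assumed, and only codons that lie wholly within a row are translated, so frames 2 and 3 come out one residue shorter than in the listing, where the last codon borrows bases from the next row. The rows numbered from 1 that follow appear to be the reverse complement of the strand above, which `reverse_complement` illustrates.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate complete codons of seq starting at offset frame (0, 1 or 2).

    Codons containing anything other than A, C, G, T are reported as 'X'.
    """
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    # Row 28141-28200 of the listing above.
    row = "TCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT"
    for frame in range(3):
        print(translate(row, frame))
    print(reverse_complement(row))
```

Running the script on a row of the listing gives a quick way to check an OCR-damaged nucleotide or residue against the other two frames.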
1 - TTTTTTTTTTTTTTTGTCATTCTCCTAAGAAGCTATTAAAATCACATGGGGATAGCACTA - 60 
-FFFFFVI LLRSY*NHMGIAL 
-FFFFLSFS*EAIKITWG*HY 
FFFFCHSPKKLLKSRGDSTT 
61 - CTAAAATTAATTTTACACATTAGGGCTCTTCCATATAGGCAGCTCTCCCTAGCATTATTC - 120 
-LKLILHI RALPYRQLSLALF 
-*N*FYTJ J GLFHIGSSP*HYS 
KINFTH*GSSI*AALPSIIH 
121 - ACTGTACCCTCGATCGTACTCCGCGTGGCCTCGATGAAAATGTGGTGGCTCTTTCAAGTC - 180 

- T V P S I VLRVASMKMWWLFQV 
-LYPRSYSAWPR*KCGGSFKS 

CTLDRTPRGLDENVVALSSP 
181 - CTCCCTAATGTTACACATTGATTAAAGATTGCTATGTGAGATTAAAGTTAACTAAACCTA - 240 
-LPNVT H * L K I A M * D * S * LNL 
-SLMLHID*RLLCEIKVN*TY 
P * C Y T L I KDCYVRLKLTKPT 
241 - CTTGTGCTGTTTAGTTACGAGAATTCATTCTGCACAAGAGTAGACTATGTATCGTAAACG - 300 
-LVLFS YEN SFCTRV DYVS * T 
-LCCLVTRIHSAQE* TMYRKR 
CAV*LREFILHKSRLCIVNG 
301 - GAATTGCGAAAACGTTTACATAGCCCATCTGCCTTGTGTGGTCATCATGAGTGTTTATGC - 360 
-ELRKRLHS PSALCGHHECLC 
-NCENVYIAHLPCVVIMSVYA 
IAKTFT*PICLVWSS*VFMP 
361 - CTGAGTTGAATCAGCAGAAGCTCCACTCATGGAATTTTGAAGTTGTCTGGAGAAATCATC - 420 

- L S * ISRSSTHGILKLSGEII 

*VESAEAPLMEF*SCLEKSS 
ELNQQKLHSWNFEVVWRNHP 
421 - CATGTCAGCCGCAGGAAGAAGAGTCACAGTGGGCTGCTTCTTTTGTCTCTGCGGCAAAGG - 480 
-HVSRRKKS HSGLLLLSLRQR 
-MSAAGRRVTVGCFFCLCGKG 
CQPQEEESQWAASFVSAAKA 
481 - CTGAGCTTCATCAGTCTTTTTCTTTTTGTCCTTTTTAGGCTCTGTTGGTGGGAATGTTTT - 540 

- L S F I S LFL FVLFRLCWWECF 
-*ASSVFFFLSFLGSVGGNVL 

ELHQSFSFCPF*ALLVGMFC 
541 - GTATGCGTGAATGTGCTTGTTCAGCAGTATGACGTTGTCTTTGAATTGTGGATCTTTGTC - 600 
-VCVNVLVQQYDVVFELWIFV 
-YASMCLFSSMTLSLNCGSLS 
MRQCACSAV*RCL*IVDLCH 
601 - ATCCAATTTAATGGCTCCATGATAAGTCAGCCATGTTCCCGAAGGTGTGACTTCCATGCC - 660 
-I QFNG SMI SQPCSRRCDFHA 
-SNLMAP**VSHVPEGVTSMP 
PI*WLHDKSAMFPKV*LPCQ 
661 - AATGCGTGACATTCCAAAGAATGCAGAGGCACTTGGAGCAAATTGTGCAATTTGCGGCCA - 720 
-NA*HSKECRGTWSKLCNLRP 
-MRDIPKNAEALGANCAICGQ 
CVTFQRMQRHLEQIVQFAAN 
721 - ATGTTTGTAATCAGTTCCTTGTCTGATTAGGTCTTGGTCCCCGAAATTTCCTTGGGTTTG - 780 
-MFVISSLSD*VLVPEISLGL 
-CL*SVPCLIRSWSPKFPWVC 
VCNQFLV* LGLGPRNFLGFV 
781 - TTCTGGACCACGTCTCCCAAATGCTTGAGTGACGTTGTACTGTTTTGTGGCAGTACGTTT - 840 
-FWTTS PKC LSDVVLFCGSTF 
SGPRLPNA*VTLYCFVAVRF 
LDHVSQMLE*RCTVLWQYVF 
841 - TTGGCGAGGCTTTTTAGATGCCTCAGCAGCAGATTTCTTAGTGACAGTTTGGCCTTGTTG - 900 
-LARLFRCLSSRFLSDSLALL 
-WRGFLDASAADFLVTVWPCC 
GEAF*MPQQQIS**QFGLVV 
901 - TTGTTGGCCTTTACCAGAAACTTTGCTCTCAAGCTGGTTCAATCTGTCTAGCAGCAATAG - 960 
-LLAFTRNFALKLVQSV* Q Q * 
-CWPLPETLLSSWFNLSSSNS 
VGLYQKLCSQAGS ICLAAIA 
961 - CGCGAGGGCAGTTTCACCACCTCCGCTAGCCATTCGAGCAGGAGAATTTCCCCTACTGCT - 1020 
-REGSFTTS ASHS S R R I SPTA 
-ARAVSPPPLAI RAGEFPLLL 
RGQFHHLR* PFEQENFPYCC 
1021 - GCCAGGAGTTGAATTTCTTGAATTACCGCGACTACGTGATGAGGAGCGAGAAGAGGCTTG - 1080 
-ARS*IS*ITATT**GARRGL 

- PGVEFLELPRLRDEEREEA* 

QELNFLNYRDYVMRSEKRLD 
1081 - ACTGCCGCCTCTGCTTCCCTCTGCGTAGAAGCCTTTTGGCAATGTTGTTCCTTGAGGAAG - 1140 
-TAASASLCVEAFWQCCSLRK 
-LPPLLPSA*KPFGNVVP*GS 
CRLCFPLRRSLLAMLFLEEV 
1141 - TTGTAGCACGGTGGCAGCATTGTTATTAGGATTGCGGGTGCCAATGTGGTCTTTGGGTGT - 1200 

- L * HGGS IV IR IAGANVVFGC 
-CSTVAALLLGLRVPMWSLGV 

VARWQHCY* DCGCQCGLWVY 
1201 - ATTCAAGGCTCCCTCAGTTGCAACCCATACGATGCCTTCTTTGTTAGCGCCGTAGGGAAG - 1260 
-IQGSLSCN PY DAFFVSAVGK 
-FKAPSVATHTMPSLLAP*GS 
S RLPQ LQP I RCL L C * RRREV 
1261 - TGAAGCTTCTGGGCCAGTTCCTAGGTAATAGAAGTACCATGTGGGGCTGAGCTGTTTCAT - 1320 

- * SFWASS*VIEVPSGAELFH 
-EASGPVPR* *KYHLGIiSSFI 

KLLGQFLGNRSTIWG*ALSF 
1321 - TTTGCCGTCACCACCACGAACTCGTCGGGTAGCTCTTCGGTAGTAGCCAATTTGGTCATC - 1380 
-FAVTTTNSSGSSSVVANLVI 
-LPSPPRTRRVALR**PIWSS 
CRHHHE LVG* LFGSSQFGHL 
1381 - TGGACCACTATTGGTGTTGATTGGAACGCCCTGGCCTCGAGGGAATCTAAGTTCCTCCTT - 1440 
-WTTIGVDWNALASRESKFLL 
-GPLLVLIGTPWPRGNLSSSL 
DHYWC*LERPGLEGI*VPPC 
1441 - GCCATGCTGAGTGAGAGCTGTGAACCAAGACGCAGTATTATTGGGTAAACCTTGGGGTCG - 1500 
-AMLSESCE PRRSI I G * T L G S 
PC * VRAVNQDAVLLGKPWGR 
HAE*EL*TK TQYYWVNLGVG 
1501 - GCGCTGTTTTGGCCTTGCCGCATTGCGTCCTCCATTCTGGTTATTGTCAGTTGAATCTGT - 1560 
-ALFWPCPIASSILVIVS*IC 
'-RCFGLAPLRPPFWLLSVESV 
AVLALPHCVLHSGYCQLNIjW 
1561 - GGGTCCACCAAATGTAATGCGGGGGGCACTACGTTGGTTTGATTGGGGTCCATTATCAGA - 1620 
-GSTKCNAGGTTLV* LGS I IR 
-GPPNVMRGAL RWFDWGPLSD 
VHQM*CGGHYVGLIGVHYQT 
1621 - CATTTTAATTTGTTCGTTTATTTAAAACAACAAGTACGTCTCTAAATGCAGCAGTTTGGT - 1680 
-HFNLFVYLKQQVRL*MQQFG 
-ILICSFI*NNKYVSKCSSLV 
F * FVRLFKTTSTSLNAAVW* 
1681 - GACCTTCATGAAGGTACCAACACCTAGCTATAAGCGCACCACCAGCTGGATCTTGACAGT - 1740 
-DLHEGTNT*L*AHHQLDLDS 
-TFMKVPTPSYKRTTSWILTV 
PS*RYQHLAISAPPAGS*QL 

1741 - TGATAGTAACATTAGGTGTGCATGTTTGAACCATAGTGTGCCATCTATGAAAAGGTAAAA - 1800 

- * * *H*VCMFEP*CAIYEKVK 

- DSNI RCACLNHSVPSMKR*N 

IVTLGVHV*TIVCHL*KGKT 
1801 - CCTTTCCTAGAGCACAAAGCCAAGCAGTGCTATAAGTATTACCCCTAGTGTTGTACCTTA - 1860 

- P F L E H KAKQC Y K Y Y P * CCTL 
-LS^STKPSSAISITPSVVPY 

FPRAQSQAVL*VLPLVLYLT 
1861 - CAAGGATCTTCAAGCACATGAGGTTTATTAGATGCACAGCGCTGTACTACAGTGCATATG - 1920 
-QGS S S T * GLLDAQRCTTVHM 
-KDLQAHEVY*MHSAVLQCIC 
RIFKHMRPIRCTALYYSAYA 
1921 - CAACTGCATAGAGAAATACAAGTCAAAACAATGAGAAGTTTCATGTTCGTTTAGACTTTG - 1980 
-QLHRE IQVKTMRSFMFV*TL 
-NCIEKYKSKQ*EVSCSFRLW 
TA±RNTSQNNEKFHVRLDFG 
1981 - GTACAAGGTTCTTCTAGATCCTGGATTTCGAGTGAAAACCAAAATATAATAAGCATTATT - 2040 
-VQGSSRSWISSENQNIISII 
-YKVLLDPGFRVKTKI * * A L L 
TRFF*ILDFE *KPKYNKHY* 
2041 - AAAACAAGGAATAGCAGAAAGGCTAAAAAGCACAAATAGAAGTCAATTAAAGTGAGCTCA - 2100 
-KTRNSRKAKKHK*KSIKVSS 

- KQGIAERLKSTNRSQLK*AH 

NKE*QKG*KAQIEVN*SELI 
2101 - TTCATTCTGTCTTTCTCTTAATGGTGAAGCAAAGTATTAAAAATACTAGAGCAGCAACAA - 2160 
-FILSFS*W*SKVLKILEQQQ 
-SFCLSLNGEAKY*KY*SSNN 
HSVFLLMVKQ S IKNTRAATM 
2161 - TGAGAAAAAGTGGCGAGTAGAGCTCTTGTTGAACCTCCTCTTGTCTGATGAAAAGTTTTG - 2220 
-*EKVASRALVEPPLV**KVL 
-EKKWRVELLLNLLLSDEKFW 
RKSGE*SSC*TSSCLMKSFG 
2221 - GTGAAACTGATCTTGCACGCAGCTGATAGGTATGTCGAGTACCGTCAGCACAAGCAAAAG - 2280 
-VKLILHAADRYVEYRQHKQK 
~*N*SCTQLIGMSSTVSTSKS 
ETDLARS**VCRVPSAQAKA 
2281 - CAAAGTGTGTGCTAGTGCAAGTTAGTGCAAATTTATTGTCAGCAAGAGGGTGAAATGGTG - 2340 

- Q S V C * CKLVQIYCQQEGEMV 
~KVCASAS*CKFIVSKRVKW* 

KCVLVQVSANLLSARG*NGE 
2341 - AATTGCCCTCGTATGTTCCTGATGGGCAAGGTTCTTTTAGTAGTACAGTCGTACCTCTAA - 2400 
-NCPRM FLMGKVLLVVQSYL* 
-IALVCS*WARFF**YSRTSN 
LPSYVPDGQGSFSSTVVPLT 
2401 - CACACTCCTGATAGTGATATAGCTCGCAAGATGTAAATACAATCAATGTCAGGAAGAGAA - 2460 
-HTPDS DIARKM* IQSMSGRE 
-TLLIVI* LARCKYNQCQEEN 
HS***YSSQDVNTINVRKRI 
2461 - TAATTTTCATGTTCGTTTTATGGATAATCTAACTCCATAGGTTCTTCATCATCTAACTCC - 2520 

- * FSCSFYG*SNSIGSSSSNS 
-NFHVRFMDNLTP-VLHHLTP 

IFMFVLWI I*LHRFFII*LR 
2521 - GAATAATTCTTCTTAGTTAGAGGCTTAAATAATTGTCTCACTATTGAACTTATTATAACG - 2580 

- E * FFLVRGLNNCLTIELI IT 

- N N S S *LEA*IIVSIjLNLL*R 

IILLS * f RLK*LSHY*TYYNV 
2581 - TCAAGATTCCAAATAGCAATCCTGAAAGTCCTCATAATGATAATCAATATCTCTGCTATT - 2640 
-SRFQIAI LKVLIMI I N I S A I 
-QDSK"QS*KSS~* * S I S L L L 
KIPNSNPESPHNDNQYLCYC 
2641 - GTAACCTGGAAGTCAACAAGATGAAACATCTGTTGTCACTTACTGTACTAGCAAAGCAAT - 2700 
-VTWKS TR*NICCHLLY * Q S N 

- * PGSQQDETSVVTYCTSKAI 

NLEVNKMKHLLSLTVLAKQY 
2701 - ATTGTCGTTGCTACCGGCGTGGTCTGTATTTAATTTATAGTTTCCAATACGGTAGCGGTT - 2760 
-IVVATGVVCI*FIVSNTVAV 
-LSLLPAWSVFNL* F P I R * R L 
CRCYRRGLYLIYSFQYGSGC 
2761 - GTATGCAGCAAAACCTGAATCAGTGCCTACACGCTGCGACGCTCCTAATTTGTAATAAGA - 2820 

- V C S K T * ISAYTLRRS* FVIR 
-YAAKPESVPTRCDAPNL* * E 

MQQNLNQCLHAATLLICNKK 
2821 - AAGCGTTCGTGATGTAGCCACAGTGATCTCTTTTGGCAGGTCCTTAATGTCACAGCGCCC - 2880 
-KRS* CSHSDLFWQVLNVTAP 
-SVRDVATVISFGRSLMSQRP 
AFVM* PQ*SLLAGP*CHSAL 
2881 - TAGGGAGTGTCCGGCCATTCGCAAGTGACCACGAATGATCACAGCACCAATGACAAGTTC - 2940 
-*GVSGHSQVTTNDH STNDKF 
-RECPAIRK*PRMI TAPMTSS 
GSVRPFASDHE* SQHQ*QVH 
2941 - ACTTTCCATGAGCGGTCTGGTCACAATTGTCCCCCGGAGAGGCACATTGAGAAGAATGTT - 3000 
-TFHE RSGHNCPPERHI EKN V 
-LSMS GLVTIVPRRGTLRRMF 
FP*AVWSQLSPGEAH*EECL 
3001 - TGTTTCTGGGTTGAATGACCACATTGAGCGGGTACGAGCAAACAGCCTGAAGGAAGCAAC - 3060 

- C F W V E * PH*AGTSKQPEGSN 
-VSGLNDHIERVRANSLKEAT 

FLG*MTTLSGYEQTA*RKQR 
3061 - GAAGTAGCTAAGCCACATCAAGCCTACAATACAAGCCATTGCAATCGCAATCCCGCCAGT - 3120 
-EVAK P HQAYNT SH CNRNPAS 
-K*LSHIKPTIQAIAIAIPPV 
SS*ATSSLQYKPLQSQSRQS 
3121 - CACCCAATTAATTCTGTAGACAACAGCAAGCACAAAACAAGCAAGTGTTACTGGCCACAA - 3180 
-HPINSVDNSKHKTSKCYWPQ 

- T Q L I L*TTASTKQASVTGHK 

P N * FCRQQQAQNKQVLLATR 
3181 - GAGCCAGAGGAAAACAAGCTTTATTATGTACAAAAACCTGTTCCGATTAGAATAGGCAAA - 3240 
-EPEENKLYYVQKPVP I RIGK 
-SQRKTS FIMYKNLFRLE*AN 
ARGKQALLCTKTCSD*NRQI 
3241 - TTGTAGTAACATAATCCAGGCTAGGAATAGGAAACCTATTACTAGGTTCCATTGTTCCAG - 3300 

- L * *HNPG*E*ETYY*VPLFQ 

- C S N I IQARNRKPITRFHCSR 

V V T * SRLGIGNLLLGS IVPG 
3301 - GAGTTGTTTAAGCTCCTCAACGGTAATAGTACCGTTGTCTGCCATGATAAGCAATGTTAA - 3360 
-ELFKLLNGNSTVVCHDKQC* 
-SCLSSSTVIVPLSAMISNVK 
VV*APQR**YRCLP**AMLK 
3361 - AGTTCCAAACAGAATAATAATAATAGTTAGTTCGTTTAGACCAGAAGATCAGGAACTCCT - 3420 
-SSKQNNNNS* FV*TRRSGTP 
-VPNRIIIIVSSFRPEDQELL 
FQTE * * * *LVRLDQKIRNSF 

3421 - TCAGAAGAGTTCAGATTTTTAACACGCGAGTAGACGTAAACCGTTGGTTTTACTAAACTC - 3480 
-SEEFRFLTRE*T*TVGFTKL 

- Q K S S DF^HASRRKPLVLLNS 

RRVQ I FNTRVDVNRW FY * TH 
3481 - ACGTTAACAATATTGCAGCAGTACGCACACAATCGAAGCGCAGTAAGGATGGCTAGTGTG - 3540 
-TLTI LQQYAHNRSAVRMASV 
-R*QYCSSTHTIEAQ*GWLV* 
VNNIAAVRTQSKRSKDG*CD 
3541 - ACTAGCAAGAATACCACGAAAGCAAGAAAAAGAAGTACGCTATTAACTATTAACGTACCT - 3600 
-TSKNTTKARKRSTLLTINVP 

- L A R I PRKQEKEVRY*LLTYL 

*QEYHESKKKKYAINY*RTC 
3601 - GTTTCTTCCGAAACGAATGAGTACATAAGTTCGTACTCACTTTCTTGTGCTTACAAAGGC - 3660 

- V S SETNEY I S SYSLSCAYKG 
-FLPKRMST * V R T H FLVLTKA 

F F R N E*VHKFVLTFLCLQRH 
3661 - ACGCTAGTAGTCGTCGTCGGCTCATCATAAATTGGATCCATTGCTGGATTAGCAACTCCT - 3720 
~TLVVVVGSS*IGSIAGLATP 
-R**SSSAHHKLDPLLD*QLL 
ASSRRRLIINWIHCWISNS* 
3721 - GAAGAGCCGTCGATTGTGTGTATTTGCACATTCGGTGGGTCTTTAACAAGCTTGTTAAAG - 3780 
-EEPS IVCICTFGGSLTSLLK 
~KSRRLCVFAHSVGL*QAC*R 
RAVDCVYLHIRWVFNKLVKD 
3781 - ATGAAGAATGTAGCATTTTCAATACCAGTGTCTGTAGTAATTTGTGTAGACTCAAGCTGG - 3840 
-MKNVAFS I PVSVVICV DSSW 
-*RM* HFQYQCL** F V * TQAG 
EECS IFNTSVCSNLCRLKLV 
3841 - TAGTAAACTTCGGTGAAATAGCCATGTACAACGACATAGTCTTTAACACCTGAGTGCCTA - 3900 
-**TSVK* PCTTT*SLTPECL 

- SKLR*NSHVQRHSL*HLSAY 

VNFGEIAMYNDIVFNT*VPI 
3901 - TCCTCAGAATAACCACCAATTTGGTAGTCTTCTTTGAGTTTTGGTGTTGAAATGCCGTCA - 3960 

- S S E * PPIW*SSLSFGVEMPS 
-PQNNHQFGSLL*VLVLKCRH 

LRITTNLVVFFEFWC*NAVT 

3961 - CCTTCAGTAACGACAATTGTATCTGTGACACTGTTATATGGTATACAGTAGTCATAGTTA - 4020 

-PSVTTIVSVTLLYGIQ*S*L 
-LQ*RQLYL *HCYMVYSSHSY 
FSNDNCICDTVIWYTVVIVM 

4021 - TGTGTGTGCCAGCAAACAAAGTAGTTGGCATCATAAAGTAATGGGTTCTTGGATTTGCAC - 4080 

-CVCQQTK* LAS * SNGFLDLH 
-VCASKQSSWHHKVMGSWICT 
CVPANKVVGIIK*WVLGFAL 
4081 - TTCCAACAAAGCCAACATCTCATAATAATTCTACATGCGTTGATGCATTGTAGAAAATAT - 4140 
-FQQSQHLI I ILHALMHCRKY 

- SNKANIS* *FYMR*CIVENI 

PTKPTSHNNSTCVDAL* KIY 
4141 - ATCAAGGCATAGAGGTACAAAAATTGCGCCTCCTTACGTGCAGCGACAAGCAAAAGATGT - 4200 

- I K A * RYKNCASLPAAT SKRC 
-SRHRGTKIAPPYLQRQAKDV 

QGIEVQKLRLLTCSDKQKM* 
4201 - GAATAGATGGTAACAAATAGCAGCAGTAAATTGCAAATGAAGTGGAAGCCCTTATAAAGG - 4260 
-E*MVTNSSSKLQMNWKPL*R 
-NRW*QIAAVNCK*TGSPYKG 
IDGNK*QQ* IANELEALIKG 
4261 - GCTAGCTGCCATCTTTTATTGAGCGCAATTATTTTGGTAGCGCTCTGAAAAACAGCAAGA - 4320 
-ASCHLLLSAI ILVAL*KTAR 
-LAAIFY* AQLFW*RSEKQQE 
* LPS FIERNYFGSALKNSKK 
4321 - AATGCAACGCCAATAACAAGCCATCCGAAAGGGAGTGAGGCTTGTAGCGGTATCGTTGCT - 4380 

- N A T P I TSHPKGSEACSGIVA 

- M Q R Q * QAIRKGVRLVAVS LL 

CNANNKPSERE* GL*RYRCC 
4381 - GTAGCATGAACAGTACTTGCAGGAGAAGCATTGTCAATTTTTACTGGCTGTGCAGTAATT - 4440 

- V A * T V LAGEALS I FTGCAVI 
<- * HEQYLQEKHCQFLLAVQ*L 

SMNSTCRRSIVNFYWLCSN* 
4441 - GATCCAAGAGTAAAAAATCTCATAAACAAATCCATAAGTTCGTTTATGTGTAATGTAATT - 4500 
-DPRVKNLINKSISSFMCNVI 
-IQE*KIS*TNP*VRLCVM*F 
SKSKKSHKQIHKFVYV* CNL 
4501 - TGACACCCTTGAGAACTGGCTCAGAGTCATCCTCATCAAACTTGCAGCAAGAACCACAAG - 4560 

- * HP*ELAQSHPHQTCSKNHK 
-DTLENWLRVILIKLAARTTR 

TPLRTGSESSSSNLQQEPQE 
4561 - AGCATGCACCCTTGAGGCAACTGCAACAACTAGTCATGCAACAAAGCAAGATTGTAACCA - 4620 
-SMHP* GNCMN * SCNKARL* P 
-ACTLEATATTSHATKQDCNH 
HAPLRQLQQLVMQQSKIVTM 
4621 - TGACGATGGCAATTAGTCCAGCAATGAAGCCGAGCCAAACATACCAAGGCCATTTAATAT - 4680 
-*RWQLVQQ*SRAKHTKAI*Y 
-DDGN*SSNEAEPNI PRPFNI 
TMAISPAMKPSQTYQGHLIY 
4681 - ATTGCTCATATTTTCCCAATTCTTGAAGGTCAATGAGTGATTCATTTAAATTTTTAGCGA - 4740 
-IAHIFPILEGQ*VIHLNF*R 
-LLIFSQFLKVNE*FI* IFSD 
CSYFPNS*RSMS DSFKFLAT 
4741 - CCTCATTGAGGCGGTCAATTTCTTTTTGAATGTTGACGACAGAAGCGTTAATGCCTGAAA - 4800 
-PH * GGQFLFEC* RQKR* CLK 
-LIEAVNFFLNVDDRSVNA*N 
SLRRSISF*MLTTEAI»M.PEM 
4801 - TGTCGCCAAGATCAACATCTGGTGATGTATGATTTTTGAAGTACTTGTCCAGCTCTTCTT - 4860 
-CRQDQHLVMYDF* STCPAI>L 
-VAKINIW*CMI FEVLVQLFF 
SPRSTSGDV^FLKYLSSSSL 
4861 - TGAATGAGTCAAGCTCAGGTTGCAGAGGATCATAAACTGTGTTGTTAATGATGCCAATAA - 4920 

- * MSQAQVAEDHKLCC * * C Q * 

- E *VKLRliQRI INCVVN DANN 

NESSSGCRGS*TVLLMMPIT 
4921 - CGACATCACAATTTCCTGAGACAAATGTATTGTCTGTAGTAATTATTTGTGGAGAAAAGA - 4980 
-RHHNFLRQMYCL * * L F V E K R 

- DITIS*DKCIVCSNYLWRKE 

TSQFPETWVLSVVIICGEKK 
4981 - AGTTCCTCTGTGTAATAAACCAAGAAGTGCCATTAAACACAAAAACACCTTCACGAGGGA - 5040 

- S S S V * * T K K C H * TQKHLHEG 
-VPLCNKPRSAIKHKNTFTRE 

FLCVINQEVPLNTKTPSRGK 
5041 - AGTATGCTTTGCCTTCATGACAAATTGCTGGCGCTGTGGTGAAGTTCCTCTCCTGGGATG - 5100 
-SMLCLHDKLLALW* SSSPGM 
-VCFAFMTNCWRCGEVPLLGW 
YALPS*QIAGAVVKFLSWDG 

5101 - GCACATACGTGACATGTAGGAAGACAACACCATGCGGGGCTGCTTGTGGGAAGGACATAA - 5160 

- A H T * HVGRQHHAGLLVGRT* 
-HIRDM*EDNTMRGCLWEGHK 

TYVTCRKTTPCGAACGKDIR 
5161 - GGTGGTAGCCCTTTCCACAAAAGTCAACTCTTTTTGATTGTCCAAGAACACACTCAGACA - 5220 
-GGSPFHKSQLFLIVQEHTQT 
-VVALSTKVNSF - * LSKNTLRH 
W * PFPQKSTLFDCPRTHSDI 
5221 - TTTTAGTAGCAGCAAGATTAGCAGAAGCCCTGATTTCAGCAGCCCTGATTAGTTGTTGTG - 5280 

- F * *QQD*QKP*FQQP*LVVV 
-FSSSKISRSPDFSSPD^LLC 

LVAARLAEALI SAALI SCCV 
5281 - TTACATAGGTTTGAAGGCTTTGAAGTCTGCCTGTAATTAACCTGTCAATTTGTACCTCCG - 5340 
-LHRFEGFEVCL* LTCQFVPP 
-Y IGLKALKSACN* PVNLYLR 
T*V*RL*SLPVINLSICTSA 
5341 - CCTCGACTTTATCAAGTCGCGAAAGGATATCATTTAGCACACTTGAAATTGCACCAAAAT - 5400
-PRLYQVAKGYHLAHLKLHQN
-LDFIKSRKDII*HT*NCTKI 
STLSSRERISFSTLEIAPKL 
5401 - TAGAGCTAAGTTGTTTAACAAGTGTGTTTAATGCTTGAGCATTGTGGTTAACAACGTCTT - 5460 

- * S *VV*QVCLMLEHSG*QRL 
-RAKLFNKCV*CLS ILVNNVL 

ELSCLTSVFNA*AFWLTTSC 
5461 - GCAGCTTGCCCAATGCAGTTGATGTTGTTGTAAGTGATTCTTGAATTTGACTAATCGCCT - 5520
-AACPMQLMLL*VILEFD*SP 
-QLAQCS*CCCK* FLNLTNRL 
SLPNAVDVVVSDS*I*LIAL 
5521 - TGTTAAATTGGTTGGCGATTTGTTTTTGGTTCTCATAGAGAACATTTTGGGTAACTCCAA - 5580

- G * IGWRFVFGSHREHFG*LQ 
-VKLVGDLFLVLIENILGNSN 

LNWLAICFWFS*RTFWVTPM 
5581 - TGCCATTGAACCTATATGCCATTTGCATAGCAAAAGGTATTTGAAGAGCAGCGCCAGCAC - 5640
-CH*TYMPFA*QKVFEEQRQH 

- A I EPICHLHSKRYLKSSAST 

PLNLYAICIAKGI*RAAPAP 
5641 - CAAATGTCCATCCAGCAGTGGCAGTACCACTAACTAGAGCAGCAGTGTAGGCAGCAATCA - 5700
-QMS IQQWQYH* LEQQCRQQS 
-KCPSSSGSTTN*SSSVGSNH 
NVHPAVAVPLTRAAV*AAII 
5701 - TATCATCAGTGAGCAGAGGTGGCAACACTGTAAGTCCATTGAACTTCTGCGCACAAATGA - 5760
-YHQ*AEVATL*VH*TSAHK*
-IISEQRWQHCKSIELLRTNE
SSVSRGGNTVSPLNFCAQMR

5761 - GATCTCTAGCATTAATATCACCTAGGCATTCGCCATATTGCTTCATGAAGCCAGCATCAG - 5820

-DL*H*YHLGIRHIAS*SQHQ 
-ISSINIT*AFAILLHEASIS 
SLALISPRHSPYCFMKPASA 

5821 - CGAGTGTCACCTTATTAAAGAGCAAGTCCTCAATAAAAGACCTCTTAGTTGGCTTTAGAG - 5880

-RVSPY*RASPQ*KTS*LALE 
-ECHLIKEQVLNKRPLSWL*R 
SVTLLKSKSSIKDLLVGFRG 
5881 - GGTCAGGTAATATTTGTGAAAAATTAAAACCACCAAAATATTTCAAAGTTGGGGTTTTGT - 5940

- G Q V I FVKN*NHQNI S K L G F C 

- V R * YL*KIKTTKIFQSWGFV 

SGNI CEKLKPPKY FKVGVLY 
5941 - ACATTTGTTTGACTTGAGCGAACACTTCACGTGTGTTGCGATCCTGTTCAGCAGCAATAC - 6000 

- T F V * LERTLHVCCDPVQQQY 

HLFDLSEHFTCVAILFSSNT 
ICLT A ANTSRVLRSCSAAIP 
6001 - CTGAGAGTGCACGATTTAGTTGTGTGCAAAAGCTACCATATTGGAGAAGCAAATTAGCAC - 6060
-LRVHDLVVCKSYHIGEAN*H
-*ECTI*LCAKATILEKQIST
ESARFSCVQKLPYWRSKLAH
6061 - ATTCAGTAGAATCTCCGCAGATGTACATATTACAATCTACGGAGGTTTTAGCCATAGAAA - 6120 
-IQ*HLRRCTYYNLRRF*P*K 

- FSRISADVHITIYGGFSHRN 

SVES PQMYILQSTEVLAIET 
6121 - CAGGCATTACTTCTGTAGTAATGCTAATTGAAAAGTTAGTAGGTATAGCAATGGTGTTAT - 6180 

- Q A L L L * *C*LKS**V*QWCY 
-RHYFCSNAN*KVSRYSNGVI 

GITSVVMLIEKLVGIAMVLL 
6181 - TAGAGTAAGCAATTGAACTATCAGCACCTAAAGACATAGTATAAGCCACAATAGATTTTT - 6240
-*SKQLNYQHLKT*YKPQ*IF
-RVSN*TIST*RHSISHNRFL
E*AIELSAPKDIV*ATIDFW
6241 - GGCTAGTACTACGTAATAAAGAAACTGTATGGTAACTAGCACAAATGCCAGCTCCAATAG - 6300 
-G*YYVIKKLYGN*HKCQLQ* 

- A S T T **RNCMVTSTNASSNR 

LVLRNKETVW*LAQMPAPIG
6301 - GAATGTCGCACTCATAAGAAGTGTCGACATGCTCAGCTCCTATAAGACAGCCTGCTTGAG - 6360 
-ECRTRKKCRHAQLL* DSLLE 
-NVAL IRSVDMLSSYKTACLS 
MSHS*EVSTCSAPIRQPA*V 
6361 - TCTGGAATACATTGTTTCCAGTAGAATATATGCGCCAAGCTGGTGTGAGTTGATCTGCAT - 6420 
-SGIHCFQ*NICAKLV*VDLH 

- LEYIVSSRIYAPSWCELICM 

WNTL FPVEYMRQAGVS* SA* 
6421 - GAATTGCTGTAGAAACATCAGTGCAGTTAACATCTTGATATAGAACAGCAACTTCAGATG - 6480 
-ELL*KHQCS*HLDIEQQLQM 
-NCCRNISAVNILI*NSNFR* 
IAVE TSVQLTS* YRTATSDE 
6481 - AAGCATTTGTTCCAGGTGTAATTACACTTACACCCCCAAAAGAGCAAGGTGAAATGTCTA - 6540
-KHLFQV*LHLHPQKSKVKCL
-SICSRCNYTYTPKRAR*NV*
AFVPGVITLTPPKEQGEMSN
6541 - ATATTTCAGATGTTTTAGGATCTCGAACGGAATCAGTGAAATCAGAAACATCACGGCCAA - 6600 

-IFQMF*DLERNQ*NQKHHGQ
-YFRCFRISNGISEIRNITAK 

I SDVLGSRTESVKSETSRPN 
6601 - ATTGTTGAAATGGTTGAAATCTCTTTGAAGAAGGAGTTAACACACCAGTACCAGTGAGTC - 6660 
-I VEMVEI SLKKELTHQYQ*V 
-LLKWLKSL*RRS*HTSTSES 
C*NG*NLFEEGVNTPVPVSP 
6661 - CATTAAAATTAAAATTGACACACTGGTTCTTAATAAGGTCAGTGGATAATTTTGGTCCAC - 6720 
-H*N*N*HTGS**GQWIILVH 

- I K I K I DTLVLNKVSG* FWST 

LKLKLTHWFLIRSVDNFGPQ 
6721 - AAACCGTGGCCGGTGCATTTAAAAGTTCAAAAGAAAGTACTACAACTCTGTAAGGTTGGT - 6780
-KPWPVHLKVQKKVLQLCKVG 
-NRGRCI*KFKRKYYNSVRLV 
TVAGAFKSSKESTTTL*GW* 

6781 - AGCCAATGCCAGTAGTGGTGTAAAAACCATAATCATTTAATGGCCAATAACAATTAAGAG - 6840
-SQCQ*WCKNHNHLMANNN*E
-ANASSGVKTIII*WPITIKS
PMPVVV*KP*SFNGQ*QLRA

6841 - CAGGTGGGGTGCAAGGTTTGCCATCAGGGGAGAAAGGCACATTAGATATGTCTCTCTCAA - 6900
-QVGCKVCHQGRKAH*ICLSQ
-RWGARFAIRGERHIRYVSLK
GGVQGLPSGEKGTLDMSLSK
6901 - AGGGCCTAAGCTTGCCATGTCTAAGATACCTATATTTATAATTATAATTACCAGTTGAAG - 6960 
-RA*ACHV*DTYIYNYNYQLK
-GPKLAMSKIPIFIIIITS*S
GLSLPCLRYLYL*L*LPVEV
6961 - TAGCATCAATGTTCCTAGTATTCCAAGCAAGGACACAACCCATGAAATCATCTGGCAATT - 7020 

-*HQCS*YSKQGHNP*NHLAI
-SINVPSIPSKDTTHEIIWQF
ASMFLVFQARTQPMKSSGNL
7021 - TATAATTATAATCAGCAATAACACCAGTTTGTCCTGGCGCTATTTGTCTTACATCATCTC - 7080 
-YNYNQQ*HQFVLALFVLHHL
-IIIISNNTSLSWRYLSYIIS
*L*SAITPVCPGAICLTSSP
7081 - CCTTGACTACAAAAGAATCTGCATAGACATTGGAGAAGCAAAGATCATTCAACTTAGTGG - 7140 
-P*LQKNLHRHWRSKDHST*W 
-LDYKRICIDIGEAKI IQLSG 
LTTKESA*TLEKQRSFNLVA 
7141 - CAGAAACGCCATAGCACTTAAAGGTTGAAAAAAATGTTGAGTTGTAGAGCACAGAGTAAT - 7200 
-QKRRST * RLKKMLSCRAQSN 
-RNAIALKG*KKC*VVEHRVI 
ETP*HLKVEKNVEL*STE*S 
7201 - CAGCAACACAATTAGAAATTTTTTTTCTCTCCCATGCATAGACAGAAGGGAATTTAGTAG - 7260 
-QQHN * KFFFSPMHRQKGI* * 
-SNTIRNFFSLPCIDRREFSS 
ATQLEIFFLSHA*TEGNLVA 
7261 - CATTAAAAACCTCTCCAAAAGGACACAAGTTTGTAATATTAGGGAATCTCACAACATCTC - 7320 

- H * KP L QKDT S L* Y * G I SQH L 
-IKNLSKRTQVCNIRESHNIS 

LKTSPKGHKFVI LGNLTTSP 
7321 - CTGAGGGAACAACCCTGAAATTAGAGGTCTGGTAAATTCCTTTGTCAATCTCAAAGCTCT - 7380
-LREQP*N*RSGKFLCQSQSS
-*GNNPEIRGLVNSFVNLKAL
EGTTLKLEVW*IPLSISKLL
7381 - TAACAGAGCATTTGAGTTCAGCAAGTGGATTTTGAGAACAATCAACAGCATCTGTGATTG - 7440 

- * Q S I * VQQVDFENNQQHL*L 
-NRAFEFSKWILRTINSICDC 

TEHLSSASGF*EQSTASVIV 
7441 - TACCATTTTCATCATACTTGAGCATAAATGTAGTTGGCTTTAAATAGCCAACAAAATAGG - 7500
-YHFHHT*A*M*LALNSQQNR 

- TIFI ILEHKCSWL* IANKIG 

PFSSYLSINVVGFK*PTK*A 
7501 - CTGCAGCTGACGTGCCCCAAATGTCTTGAGCAGGTGAAAAGGCTGTAAGAATGGCTCTAA - 7560 
-LQLTCPKCLEQVKRL*EWL* 
-CS *RAPNVLSR* KGCKNGSK 
AADVPQM S*AGEKAVRMALK 
7561 - AATTTGTAATGTTAATACCAAGAGGCAACTTAAAAATAGGTTTCAAAGTGTTAAAAGGAG - 7620
-NL*C*YQEAT*K*VSKC*NQ
-ICNVNTKRQLKNRFQSVKTR
FVMLIPRGNLKIGFKVLKPE
7621 - AAGGTAGATCACGAACTACATCTATAGGTTGATAGCCCTTATAAACATAGAGAAACCCAT - 7680
-KVDHELHL * VDS PYKHRETH 
-R*ITNYIYRLIALINIEKPI 
GRSRTTSIG**PL*T*RNPS 
7681 - CTTTATTTTTAAACACAAACTCTCGTAAGTGTTTAAAATTACCTGACTTTTCTGAAACAT - 7740
-LYF*TQTLVSV*NYLTFLKH
-FIFKHKLS*VFKIT*LF*NI 
LFLNTNSRKCLKLPDFSETS 
7741 - CAAGCGAAAAGGCATCAGATATGTACTCGAAAGTGCAATTAAATGCATTATCGAATATCA - 7800
-QAKRHQI CTRKCN * M H Y R I S 
-KRKGIRYVLESAIKCIIEYH 
SEKASDMYSKV QLNALSNII 
7801 - TAGTATGTGTCTGTGTACCCATGGGTTTAGAAACAGCAAAGAAAGGGTTGTCACACAATT - 7860
-*YVSVYPWV* KQQRKGCHTI 
SMCLCTHGFRNSKERVVTQF 
VCVCVPMGLETAKKGLSHNS 
7861 - CAAAGTTACATGCTCGTATAACAACATTAGTAGAATTGTTAATAATAATCACCGACTGTG - 7920
-QSYMLV*QH**NC***SPTV 

- KVTCSYNNISRIVNNNHRL* 

KLHARITTLVELLII ITDCD 
7921 - ACTTGTTGTTCATGGTAGAACCAAAAACCCAACCACGGACAACATTTGATTTCTCTGTGG - 7980
-TCCSW*NQKPNHGQHLISLW
-LVVHGRTKNPTTDNI * FLCG 

LLFMVEPKTQPRTTFDFSVA 

7981 - CAGCAAAATAAATACCATCCTTAAAAGGTATGACAGGGTTGCCAAACGTATGATTAATAG - 8040

-QQNKYHP*KV*QGCQTYD* * 
-SKINTILKRYDRVAKRMINS
AK*IPSLKGMTGLPNV*LIV
8041 - TATGAAACCCTGTAACATTAGAATAAAATGGAAGAAATAAATCCTGAGTTAAATAAAGAG - 8100 
-YETL*H*NKMEEINPELNKE
-MKPCNIRIKWKK*ILS*IKS 
*NPVTLE*NGRNKS* V K * R V 
8101 - TGTCTGATCTAAAAATTTCATCAGGATAGTAAACCCCCCTCATAGATGAAGTATGTTGAG - 8160 

- C L I * KFHQDSKPPS * M K Y V E 
-V*SKNFIRIVNPPHR*SMLS 

SDLKISSG**TPLIDEVC*V 
8161 - TGTAATTAGGAGCTTGAACATCATCAAAAGTGGTGCACCGGTCAAGGTCACTACCACTAG - 8220
-CN*ELEHHQKWCTGQGHYH*
-VIRSLNIIKSGAPVKVTTTS
*LGA*TSSKVVHRSRSLPLV 
8221 - TGAGAGTAAGAAATAATAAGAAAATAAACATGTTCGTTTAGTTGTTAACAAGAATATCAC - 8280

- * E * E I IRK* TCSFSC* QEYH 
-ESKK* * ENKHVRLVVNKNIT 

RVRNNKKINMFV* LLTRISL 
8281 - TTGAAACCACAACTCTGTTGTTTTCTCTAATGATAAGCCTACCTTTTTCCAGAAGAGAAT - 8340 
-LKPQLCCFL* **AYLFPEEN 

-*NHNSVVFSNDKPTFFQKRI

ETTTLLFSLMISLPFSRRE* 

8341 - AAATCATATCATTGATTTGATTCTCCTTAAGAGACATTACAGCAGTTCCTCTTAATTTAA - 8400

-KSYH*FDSP*ETLQQFLLI* 
-NHII DLILLKRHYSSSS* FK 
IISLI*FSLRDITAVPLNLR 
8401 - GAGGAAATTTGCTCATGTCAAAGAGTGAATAGGAAGACAACTGGATAGGATTTGTGTTCC - 8460
-EEICSCQRVNRKTTG* DLCS 
-RKFAHVKE* IGRQLDRICVP 
GNLLMSKSE*EDNWIGFVFL 
8461 - TCCAGAAAATGTAGTTAGCATGCATGGTATAGCCATCAATTTGTTCCTTCGGCTTGCCAA - 8520
-SRKCS*HAWYSHQFVPSACQ
-PENVVSMHGIAINLFLRLAK
QKM*LACMV*PSICSFGLPR
8521 - GATAGTTAGCCCCAATTAAAAATGCTTCCGATGATGATGCATTTACATTTGTAACAAAAG - 8580

- D S * PQLKMLPMMMHLHL* QK 
-IVSPN*KCFR**CIYICNKS 

* LAP IKNAS DDDAFTFVTKA 
8581 - CTGTCCACCATGAGAAATGGCCCATAAGCTTGTAAAGGTCAGCATTCCAAGAATGCTCTG - 8640
-LSTMRNGP*ACKGQHSKNAL
-CPP*EMAHKLVKVSIPRMLC
VHHEKWPISL*RSAFQECSV 
8641 - TTATCTTTACAGCTATAGAACCACCCAGGGCTAGTTTTTGCTTTATAAATCCACACAGAT - 8700

- L S LQL*NHPGLV F A L * IHTD 
-YLYSYRTTQG* FLLYKSTQI 

IFTAIEPPRASFCFINPHR* 
8701 - AAGTGAAAAACCCTTCTTTAGAGTCATTCTCTTTTGTCACATGTTTGGTCCTAGGGTCAT - 8760
-K*KTLL*SHSLLSHVWS*GH
-SEKPFFRVILFCHMFGPRVI
VKNPSLESFSFVTCLVLGSY
8761 - ACATATCGCTAATAATAAGGTCCCATTTATTAGCCGTATGTACTGTTGCACAGTCTCCAA - 8820
-TYR***GPIY*PYVLLHSLQ
-HIANNKVPFISRMYCCTVSN
ISLIIRSHLLAVCTVAQSPI
8821 - TTAAAGTAGAATCTGCGTCGGAGACGAAGTCATTAAGATCTGAATCGACAAGTAGTGTGC - 8880
-LK*NLRRRRSH* DLNRQVVC 
-*SRICVGDEVIKI*IDK*CA 
KVESASETKSLRSESTSSVP 
8881 - CAGTTGGCAACCATTGTCTGAGCACAGCTGTACCTGGTGCAACTCCTTTATCAGAGCCAG - 8940
-QLATIV* AQLYLVQLLYQSQ 
-SWQPLSEHSCTWCNSFIRAS 
VGNHCLSTAVPGATPLSEPA 
8941 - CACCAAAGTGAATAACTCTCATGTTGTAGGGTACAGCTAAAGTAAGTGTATTTAAGTATT - 9000
-HQSE*LSCCRVQLK*VYLSI 
-TKVNNSHVVGYS*SKCI*VL 
PK*ITLML*GTAKVSVFKY* 
9001 - GACACAGTTGAGTATACTTTGCGACATTCATCATTATTCCTTTTGGTATAACAGCATTTT - 9060 
-DTVEYTLRHSSLFLLV*QHF 

- TQLS ILCDIHHYSFWYNSIF 

HS*VYFATFIII PFGITAFS 
9061 - CACCATAATTCTGAAGGTCACACTTTTCAAGAAGCATTCTTTGCATCTTGTACAAGTTAG - 9120 
-HHNSEGHTFQEAFFASCTS* 

- T I ILKVTLFKKHSLHLVQVR 

P*F*RSHFSRSILCILYKLG
9121 - GCATCGCAACACCTGGTTGCCACGCTTGACTTGCTTGTAGTTTTGGGTAGAAGGTTTCAA - 9180 
-ASQHLVATLDLLVVLGRRFQ 

-HRNTWLPRLTCL*FWVEGFN
IATPGCHA*LACSFG*KVST
9181 - CATGTCCATCCTTACACCAAAGCATGAATGAAATTTCAGCATAGTCAATTGTAACCTTGA - 9240
-HVHPYTKA*MKFQHSQL*P*
-MSILTPKHE*NFSIVNCNLD
CPSLHQSMNEISA*SIVTLT
9241 - CCACTTTTGAAATCACTGACAAATCTTGTGACTTTATTATCTCGACAAAGTCATCAAGTA - 9300
-PLLKSLTNLVTLLSRQSHQV 
-HF*NH*QIL*LYYLDKVIK*
TFEI TDKSCDFII STKSSSK 

9301 - AAAGATCAATCACAGAACACACACATTTTGATGAACCTGTTTGCGCATCTGTTATGAAGT - 9360
-KDQSQNTHILMNLFAHLL*S 
-KINHRTHTF* A *TCLRICYEV 
RS ITEKTHFDE PVCASVMK* 

9361 - AATTTTTCACTGTGCTGTCCATAGGGATAAAATCCTCTAATTTAAGTGGTGAATCTTGTG - 9420 
-NFSLCCP*G*NPLI * V V N L V 

- I FHCAVHRDKIL* F K W * I L * 

FFTVLSIGIKSSNLSGESCE 
9421 - AGCGCTTGGCTAAGCCTATCATTAAATGAAGACCGCCAAGTTGTCCATGACTGAAATCTC - 9480 
-SAWLSLSLNEDRQVVHD*NL
-ALG*AYH*MKTAKLSMTEIS 
RLAKPIIK*RPPSCP*LKSP 
9481 - CATAAACGATGTGTTCGAAGGCATAGCCCTCGAGCTTATATCGCTGTATGAATTCATCCA - 9540 
-HKRCVRRHSPRAYIAV* IHP 

- INDVFEGIALELISLYEFIH 

*TMCSKA*PSSLYRCMNSSI 
9541 - TAGCGAGCTCGAGAAAGTCAGTTTCCATTTGTGATCTGGGCTTAAAATCCTCTAAGTCTC - 9600 

- * RARESQFPFVIWA * N P L S L 
-SELEKVSFHL*SGLKIL*VS

ASSRKSVSICDLGLKSSKSL 
9601 - TGCTCTGAGTAAAGTAGGTTTCAGGCAACTGTTGAATAATGCCGTCTACTTTCTTAAAGT - 9660 
-CSE* SRFQATVE * C R L L S * S 
-ALSKVGFRQLLNNAVYFLKV
L*VK*VSGNC*IMPSTFLK* 
9661 - AGTTAAACTGTGTTTTTACTGATTCTCCAATTAATGTGACTCCATTGACGCTAGCTTGTG - 9720 
-S*TVFLLILQLM*LH*R*LV 
-VKLCFY*FSN*CDSIDASLC 
LNCV FTDSPINVTPLTLACA 
9721 - CTGGTCCCTTTGAAGGTGTTAGACCTTTGACTGAACCTTCTGTTATTAAAACACCATTAC - 9780
-LVPLKVLDL*LNLLLLKHHY
-WSL*RC*TFD*TFCY*NTIT
GPFEGVRPLTEPSVIKTPLR
9781 - GGGCGTTTCTAAAAAGGTCTACCTGTCCTTCCACTCTACCATCAAACAAGACAGTAAGTG - 9840 

- G R F * KGLPVLPLYHQTRQ*V 
-GVSKKVYLSFRSTIKQDSK* 

AFLKRSTCPSTLPSNKTVSE 
9841 - AAGAACAAGCACTCTCAGTAGGTTTCTTGGCAATGTCAGTCATTGTGCAGACACCTATTG - 9900 
-KNKHSQ*VSWQCQSLCRHLL 

- R T S T LSRFLGNVSHCADTYC 

EQALSVGFLAMSVIVQTPIV 
9901 - TAGATACATGTGCTGGGGCTTCTCTTTTGTAGTCCCAGATTACAGTATTAGCAGCGATAT - 9960 

- * IHVLGLLFCSPRLQY* QRY 
-RYMCWGFSFVVPDYSISSDI 

DTCAGASLL*SQI TVLAAIS 
9961 - CAACACCCAAATTATTGAGTATCTTAATCTCTGGCACTGGTTTAATGTTACGCTTAGCCC - 10020
-QHPNY*VS*SLALV*CYA*P
-NTQIIEYLNLWHWFNVTLSP
TPKLLSILISGTGLMLRLAQ
10021 - AAAGCTCAAATGCAACATTAACAGGAAGTGTTGTCTTATTTTCAAAGATCTCCACATCAA - 10080
-KAQMQH*QEVLSYFQRSPHQ
-KLKCNINRKCCLIFKDLHIN
SSNATLTGSVVLFSKISTSI
10081 - TACCATCTACCTTTGTGTAAACAGCATTATTAATGATGGAAACAGGTGCTTCGCCGGCGT - 10140
-YHLPLCKQHY**WKQVLRRR
-TIYLCVNSIINDGNRCFAGV
PSTFV*TALLMMETGASPAC
10141 - GTCCATCAAAGTGTCCTTTATTAACAACATTATAAGCCACATTTTCTAAACTCTGTAACC - 10200 
-VHQSVLY*QHYKPHFLNSVT
-SIKVSFINNIISHIF*TL*P 
PSKCPLLTTL*ATFSKLCWL 
10201 - TGGTAAATGTATTCCACAGGTTATAAGTATCAAATTGTTTGTAAATCCATAGGCTAAATC - 10260 

- W * M Y STGYKYQIVCKS I G * I 
-GKCIPQVISIKLFVNP*AKS 

VNVFHRL*VSNCL* IHRLNP 
10261 - CAGCAGAAATCATCATATTATATGCATCCAAGTACTGTCGGTACTCATTTGCATGGTGTC - 10320 
-QQKS SYYMHPSTVGTH LHGV 
-SRNHHI ICIQVLSVLICMVS 
AEI I ILYASKYCRYSFAWCL 
10321 - TGCAAACAGCACCACCTAAATTGCATCGTGTAATACACGTAGCAGATTTGAGTGGAACAT - 10380 
-CKQHHLNCIV*YT*QI * V E H 

- A N S T T * IASCNTRSRFEWNI 

QTAPPKLHRVIHVADLSGT* 
10381 - AATCAATATCCGACACTACTTGTTTGCCATGAGACTCACAAGGACTATCAGAATAGTAAA - 10440
-NQYPTLLVCHETHKDYQNSK
-INIRHYLFAMRLTRTIRIVK
SISDTTCLP*DSQGLSE**K
10441 - AGAAAGGCAATTGCTTTAAATTAGTAAATGCACTTTTATCGAAAGCTGGAGTGTGGAATG - 10500 

-RKAIALN**MHFYRKLECGM
-ERQLL*ISKCTFIESWSVEC

KGNCFKLVNALLS KAGVWNA 
10501 - CATGCTTATTCACATACAAACTACCACCATCACAGCCTGGTAAGTTGAAGTTTGACAAGA - 10560 
-HAYSHTNYHHHSLVSSSLTR 
-MLIHIQTTTITAW*VQV*QD 
CLFTYKLPPSQPGKFKFDKT 
10561 - CTCTTGTGTCAAACCTACACACAATTGCATTGGCTGGGTAACGATCAACGTTACAATTCC - 10620 
-LLCQTYTQLHWLGNDQRYNS 
-SCVKPTHNCIGWVTINVTIP 
LVSNLHTIALAG*RSTLQFQ 
10621 - AAAACAAACAAACACCATCAGTGAATTTATCGTGATGTGTAGCATAAGAATAGAAGAGTT - 10680 
-KTNKHHQ* IYRDV*HKNRRV 
-KQTNTISEFIVMCSIR IEEF 
NKQTPSVNLS*CVA*E*KSS 
10681 - CCTCTATTTTGTAAGCTTTGTCACTACATGGCTGAGCATCGTAGAACTTCCATTCTACTT - 10740
-PLFCKLCHYMAEHRRTSILL
-LYFVSFVTTWLSIVELPFYF
SIL*ALSLHG*AS*NFHSTS
10741 - CAGCCTGAGGCACACACTTGATAGCCTTTGGATTTCCAATGTCATGAAGAACTGGAAACT - 10800 
-QPEAHT* * PLDFQCHEELET 
-SLRHTLDSLWISNVMKNWKL
A*GTHLIAFGFPMS*RTGNL 
10801 - TATCAGCAAGCAATGCAGACTTCACAACCATGTGTTGTACTTTTCTGCAAGCAGAATTAA - 10860 
-YQQAMQT SQPCVVLFCKQN * 
-ISKQCRLHNHVLYFSASRIN 
SASNADFTTMCCTFLQAELT 
10861 - CCCTCAGTTCATCTCCTATAATAGGGTATTCAACAGACCAATCAACGCGCTTAACAAAGC - 10920
-PSVHLL**GIQQTNQRA*QS
-PQFISYNRVFNRPINALNKA
LSSSPIIGYSTDQSTRLTKH
10921 - ACTCATGGACTGCTAAACATCTAGTCATGATAGCATCACAACTAGCCACATGTGCATTTC - 10980
-THGLLNI*S**HHN*PHVHF
-LMDC*TSSHDSITTSHMCIS 
SWTAKHLVMIASQLATCAFP 
10981 - CATGTACCTGGCAATGTTGGTCATGGTTACTCTGAAGGTTACCCGTAAAGCCCCACTGCT - 11040
-HVPGNVGHGYSEGYP* SPTA 
-MYLAMLVMVTLKVTRKAPLL 
CTWQCWSWLL*RLPVKPHC* 
11041 - GAACATCAATCATAAATGGGTTATAGACATAGTCAAAACCCACAGAATGATTCCAGCAGG - 11100 
-EHQS*MGYRHSQNPQNDSSR 
-NINHKWVI DIVKTHRMI PAG 
TSIINGL*T*SKPTE*FQQA 
11101 - CATAAGTATCTGATGAAGTAGAAAAGCAAGTTGCACGTTTGTCACACAGACAACACGTTC - 11160 
-HKYLMK* KSKLHVCHTDNTF 
-ISI**SRKASCTFVTQTTRS 
*VSDEVEKQVARLSHRQHVL 
11161 - TTTCAGGTCCAATCTTGACAAAGTACTTCATTGATGTAAGCTCAAAGCCATGCGCCCAAA - 11220 
-FQVQS* QSTSLM*AQSHAPK 
-FRSNLDKVLH*CKLKAMRPK
SGPILTKYFIDVSSKPCAQR 
11221 - GGACGAACACGACTCTGTCTGACAATCCTTTCAGTGTATCACTGAGCATTTGTACTATCT - 11280 
-GRTRLCLT I LSVYH * A F V L S 
-DEHDSV*QSFQCITEHLYYL 
TNTTLSDNPFSVSLSICTIL 
11281 - TAATACGCACTACATTCCAGGGCAAGCCTTTATACATGAGTGGTATAAGATGTTTAAACT - 11340 

- * YALHSRASLYT * V V * D V * T 
-NTHYIPGQAFIHEWYKMFKL 

IRTTFQGKPLYMSGIRCLNW 
11341 - GGTCACCTGGTGGAGGTTTTGCATTAACTCTGGTGAATTCTGTGTTATTTTCAGTGTCAA - 11400 
-GHLVEVLH * LW * ILCYFQCQ 
-VTWWRFCINSGEFCVIFSVN 
SPGGGFALTLVNSVLFSVST 
11401 - CATAACCAGTCGGTACAGCTACTAAGTTAACACCTGTAGAAAATCCTAGCTGGAGAGGTA - 11460
-HNQSVQLLS*HL*KILAGEV
-ITSRYSY*VNTCRKS*LER*
*PVGTATKLTPVENPSWRGR
11461 - GGTTAGTACCCACAGCATCTCTAGTTGCATGACAGCCCTCTACATCAAAGCCAATCCACG - 11520 
-G*YPQHL* LHDSPLHQSQST 
-VSTHSI SSCMTALYIKANPR 
LVPTASLVA*QPSTSKPIHA 
11521 - CACGAACGTGACGAATAGCTTCTTCGCGGGTGATAAACATATTAGGGTAACCATTGACTT - 11580 

-HERDE*LLRG**TY*GNH*L
-TNVTNS FFAGDKHIRVTI DL 

RT*RIASSRVINILG*PLTW 
11581 - GGTAATTCATTTTGAAACCCATCATAGAGATGAGTCTACGGTAGGTCATGTCCTTTGGTA - 11640 
-GNSF*NPS*R*VYGRSCPLV 
-VIHFETHHRDESTVGHVLWY 
* F I L K P I IEMSLR*VMSFGM 
11641 - TGCCTGGTATGTCAACACATAATCCTTCAGTCTTGAATTTTATATCAACGCTGAGGTGTG - 11700 
-CLVCQHIILQS* ILYQR*GV 
-AWYVNT* S FSLEFYINAEVC 
PGMSTHNPSVLNFISTLRCV 
11701 - TAGGTGCCTGTGTAGGATGAAGACCAGTAATGATCTTACTACAGTCCTTAAAAAGTCCAG - 11760 
-*VPV*DEDQ**SYYSP*KVQ 
-RCLCRMKTSNDLTTVLKKSS 
GACVG*RPVMILLQSLKSPV 
11761 - TTACATTTTCTGCTTGTAATGTAGCCACATTGCGACGTGGTATTTCTAGACTTGTAAATT - 11820
-LHFLLVM*PHCDVVFLDL*I
-YIFCL*CSHIATWYF*TCKL
TFSACNVATLRRGISRLVNC
11821 - GCAGTTTGTCATAAAGATCTCTATCAGACATTATGCACAAAATGCCAATTTTTGCCCTTG - 11880 
-AVCHKDLYQTLCTKCQFLPL
-QFVIKISI RHYAQNANFCPC 
SLS*RSLSDIMHKMPI FALV 
11881 - TGATAGCCACATTGAAGCGGTTGACATTACAAGAGTGTGCTGTTTCAGTAGTTTGTGTGA - 11940 
-**PH*SG*HYKSVLFQ*FV*
-DSHIEAVDITRVCCFSSLCE 
IATLKRLTLQECAVSVVCVN 
11941 - ATATGACATAGTCATATTCAGAACCCTGTGATGAATCAACAGTCTGCGTAGGCAATCCTA - 12000 
-I * HSHIQNPVMNQQSA*AIL 
-YDIVIFRTL* * INSLRRQS* 
MT*SYSEPCDESTVCVGNPK
12001 - AGATTTTTGAAGCTACAGCGTTCTGTGAATTATAAGGTGAGATAAAAACAGCTTTTCTCC - 12060 
-RFLKLQRSVNYKVR*KQLFS 
-DF*SYSVL*IIR*DKNSFSP
I FEATAFCEL*GEIKTAFLQ 
12061 - AAGCAGGATTGCGTGTAAGAAATTCTCTTACAACGCCTATTTGAGGTCTGTTGATTGCAG - 12120 
-KQDCV*EILLQRLFEVC*LQ 
-SRIACKKFSYNAYLRSVDCR
AGLRVRNSLTTPI * G L L I A D 
12121 - ATGAAACATCATGTGTAATAACACCTTTGTAGAACATTTTGAAGCATTGAGCTGACTTAT - 12180 
-MKHHV**HLCRTF* SIELTY 

- * NIMCNNTFVEHFEALS*LI 

ETSCVITPL*NILKH*ADLS 
12181 - CCTTGTGTGCTTTTAGCTTATTGTCATAAACTAAAGCACTCACAGTGTCAACAATTTCAG - 12240 
-PCVLLAYCH KLKHSQCQQFQ 
-LVCF*LIVIN* STHSVNNFS 
LCAFSLLS*TKALTVST ISA 
12241 - CAGGACAACGGCGACAAGTTCCAAGGAACATGTCTGGACCTATTGTTTTCATAAGTCTGC - 12300
-QDNGDKFQGTCLDLLFS*VC 
-RTTATSSKEHVWTYCFHKSA 
GQRRQVPRNMSGP IVFISLH 
12301 - ACACTGAATTAAAATATTCTGGTTCTAGTGTGCCTTTAGTCAGCAATGTGCGGGGGGCTG - 12360 
-TLN*NILVLVCL*SAMCGGL 
-H * IKIFWF*CAFSQQCAGGW 
TELKYSGSSVPLVSNVRGAG 
12361 - GTAATTGAGCAGGATCGCCAATATAGACGTAGTGTTTTGCACGAAGTCTAGCATTGACAA - 12420 
-VIEQDRQYRRSVLHEV*H*Q
-*LSRIANIDVVFCTKSSIDN 
N*AGSPI* T*CFARSLALTT 
12421 - CACTCAAGTCATAATTAGTAGCCATAGAGATTTCATCAAAGACTACAATGTCAGCAGTTG - 12480
-HSSHN**P*RFHQRLQCQQL
-TQVIISSHRDFIKDYNVSSC 

LKS*LVAIEISSKTTMSAVV 
12481 - TTTCTGGCAATGCATTTACAGTGCAGAAAACATACTGTTCTAGTGTTGAATTCACTTTGA - 12540 
-FLAMHLQCRKHTVLVLNSL* 
-FWQCIYSAENILF*C* IHFE 
SGNAFTVQKTYCSSVEFTLN 
12541 - ATTTATCAAAACACTCTACGCGCGCACGCGCAGGTATGATTCTACTACATTTATCTATGG - 12600
-IYQNTLRAHAQV*FYYIYLW
-FIKTLYARTRRYDSTTFIYG
LSKHSTRARAGMILLHLSMG
12601 - GCAAATATTTTAATGCCTTTTCACATAGGGCATCAACAGCTGCATGAGAGCATGCCGTAT - 12660
-ANILMPFHIGHQQLHESMPY
-QIF*CLFT*GINSCMRACRI
KYFNAFSHRASTAA*EHAVY
12661 - ACACTATGCGAGCAGATGGGTAATAGAGAGCAAGTCCGATGGCAAAATGACTCTTACCAG - 12720
-TLCEQMGNREQVRWQNDSYQ
-HYASRWVIESKSDGKMTLTS
TMRADG**RASPMAK*LLPV
12721 - TACCAGGTGGTCCTTGGAGTGTAGAGTACTTTTGCATGCCGACCTTTTGATAATTTGCAA - 12780
-YQVVLGV*STFACRPFDNLQ
-TRWSLECRVLLHADLLIICN
PGGPWSVEYFCMPTF**FAT
12781 - CATTGCTAGAAAACTCATCTGAGATGTTGAGTGTTGGGTACAAGCCAGTAATTCTCACAT - 12840
-HC*KTHLRC*VLGTSQ*FSH
-IARKLI*DVECWVQASNSHI
LLENSSEMLSVGYKPVILT*
12841 - AGTGCTCTTGTGGCACTAGAGTAGGTGCACTAAGTGGCATTACAGTGTGAGATGTCAACA - 12900
-SALVALE*VH*VALQCEMST
-VLLWH*SRCTKWHYSVRCQH
CSCGTRVGALSGITV*DVNT
12901 - CAAAGTAATCACCAACATTCAACTTGTATGTCGTAGTACCTCTGTACACAACAGCATCAC - 12960
-QSNHQHSTCMS*YLCTQQHH
-KVITNIQLVCRSTSVHNSIT
K*SPTFNLYVVVPLYTTASP
12961 - CATAGTCACCTTTTTCAAAGGTGTACTCTCCAATCTGTACTTTACTATTTTTAGTTACAC - 13020
-HSHLFQRCTLQSVLYYF*LH
-IVTFFKGVLSNLYFTIFSYT
*SPFSKVYSPICTLLFLVTR
13021 - GGTAACCAGTAAAGACATAGTTTCTGTTCAATGGTGGTCTAGGTTTTCCAACCTCCCATG - 13080
-GNQ*RHSFCSMVV*VFQPPM
-VTSKDIVSVQWWSRFSNLP*
*PVKT*FLFNGGLGFPTSHE
13081 - AAAGATGCAATTCTCTGTCAGAGAGTACTTCGCGTACAGTGGCAATACCATATGACAGCT - 13140
-KDAILCQRVLRVQWQYHMTA
-KMQFSVREYFAYSGNTI*QL
RCNSLSESTSRTVAIPYDSL
13141 - TAAATGTTTCCTCAGTGGCTTTGAGCGTTTCTGCTGCGAAAAGCTTGAGTCTCTCAGTAC - 13200
-*MFPQWL*AFLLRKA*VSQY
-KCFLSGFERFCCEKLESLST
NVSSVALSVSAAKSLSLSVQ
13201 - AAGTGTTGGCAAGTATGTAATCGCCAGCATTAGTCCAATCACATGTTGCTATCGCATTGA - 13260
-KCWQVCNRQH*SNHMLLSH*
-SVGKYVIASISPITCCYRIE
VLASM*SPALVQSHVAIALK
13261 - AGTCAGTGACATTGTCACTGCCTACACATGTGTTTTTGTATAAACCAAAAACCTGACCAT - 13320
-SQ*HCHCLHMCFCINQKPDH
-VSDIVTAYTCVFV*TKNLTI
SVTLSLPTHVFLYKPKT*PL
13321 - TAGCACATAATGGAAAACTAATGGGAGGCTTATGTGACTTGCAATAATAGCTCATACCTC - 13380
-*HIMEN*WEAYVTCNNSSYL
-ST*WKTNGRLM*LAIIAHTS
AHNGKLMGGLCDLQ**LIPP
13381 - CTAGATACAGTTGTGTCACATCAGTGACATCACAACCTGGGGCATTGCAAACATAGGGAT - 13440
-LDTVVSHQ*HHNLGHCKHRD
-*IQLCHISDITTWGIANIGI
RYSCVTSVTSQPGALQT*GL
13441 - TAACAGACAACACTAATTTGTGTGATGTTGAAATGACATGGTCATAGCAGCACTTGCAAC - 13500

- * Q T T L I CVMLK*HGHSSTCN 
-NRQH* FV*C*NDMVI AALAT 

TDNTNLCDVEMTWS*QHLQH
13501 - ATAGGAATGGTCTCCTAATACAGGCACCGCAACGAAGTGAAGTCTGTGAATTGCACAATA - 13560
-IGMVS*YRHRNEVKSVNCTI
-*EWSPNTGTATK*SL*IAQY

RNGLLIQAPQRSEVCELHNT 
13561 - CACAAGCACCTACAGCCTGCAAGACTGTATGTGGTGTGTACATAGCCTCATAAAACTCAG - 13620

- H K H LQPARLYVVCT * PHKTQ 
-TSTYSLQDCMWCVHSLIKLR 

QAPTACKTVCGVYIAS * N S G 
13621 - GTTCCCAGTACCGTGAGGTGTTATCATTAGTTAGCATTACGGAATACATGTCCAACATGT - 13680 
-VPSTVRCYH*LALRNTCPTC 

-FPVP*GVIIS*HYGIHVQHV

SQYREVLSLVSITEYMSNMW 
13681 - GGCCAGTAAGCTCATCATGTAACTTTCTAATGTATTGTAAATACAAGTGAAAGACATCAG - 13740
-GQ*AHHVTF*CIVNTSERHQ 
-ASKLIM*LSNVL* IQVKDIS 
PVSS SCNFLMYCKYK*KTSA 
13741 - CATACTCCTGATTAGGATGTTTTGTAAGTGGGTAAGCATCAATAGCCAGTGACACGAACC - 13800 
-HTPD*DVL*VGKHQ*PVTRT
-ILLIRMFCKWVSINSQ*HEP
YS *LGCFVSG*AS IAS DTNL 
13801 - TTTCAATCATAAGTGTACCATCTGTTTTGACAATATCATCGACAAAACAGCCTGCGCCTA - 13860
-FQS*VYHLF*QYHRQNSLRL 
-FNHKCTICFDNII DKTACA* 
SIISVPSVLTISSTKQPAPN 
13861 - ATATTCTTGATGGATCTGGGTAAGGCAGGTACACGTAATCATCTCCTTGTTTAACTAGCA - 13920 
-I FLM D LGKAGTRNH LLV * LA 
-YS*WIWVRQVHVIISLFN*H 
ILDGSG*GRYT*SSPCLTSI 
13921 - TTGTATGCTGTGAGCAAAATTCGTGAGGTCCTTTAGTAAGGTCAGTCTCAGTCCAACATT - 13980 
-LYAVSKIREVL* *GQSQSNI 
-CML*AKFVRSFSKVSLSPTF 
VCCEQNS*GPLVRSVSVQHF 
13981 - TTGCCTCAGACATGAACACATTATTTTGATAATAAAGAACTGCCTTAAAGTTCTTAATGC - 14040
-LPQT*THYFDNKELP*SS*C 
-CLRHEHI ILIIKNCLKVLNA 
ASDMNTLF** *RTALKFLML 
14041 - TAGCTACTAAACCTTGAGCCGCATAGTTACTGTTATAGCACACAACGGCATCATCAGAAA - 14100 

- * LLNLEPHSYCYSTQRHHQK 
-SY*TLSRIVTVIAHNGIIRK 

ATKP*AA*LLL*HTTASSER 
14101 - GAATCATCATGGAGAAATGTTTACGCAGGTAAGCGTAAAACTCATCCACGAATTCATGAT - 14160 
-ESSWRNVYAGKRKTHPRIHD 
-NHHGEMFTQVSVKLIHEFMI 
IIMEKCLRR*A*NSSTNS*S 
14161 - CAACATCCCTATTTCTATAGAGACACTCATAGAGCCTGTGTTGTAGATTGCGGACATACT - 14220 
-QHPYFYRDTHRACVVDCGHT 
-NIPISIETLIEPVL*IADIL 
TSLFL*RHS*SLCCRLRTYL
14221 - TGTCAGCTATCTTATTACCATCAGTTGAAAGAAGTGCATTTACATTGGCTGTAACAGCTT - 14280 
-CQLSY YHQLKEVHLHWL* QL 
-VSYLITIS*KKCIYIGCNSL 
SAILLPSVERSAFTLAVTA* 
14281 - GACAAATGTTAAAGACACTATTAGCATAAGCAGTTGTAGCATCACCGGATGATGTTCCAC - 14340
-DKC * R H Y * H K Q L * HHRMMFH 
-TNVKDTISISSCSITG*CST
QMLKTLLA*AVVASPDDVPP 
14341 - CTGGTTTAACATATAGTGAGCCGCCACACATGACCATCTCACTTAATACTTGCGCACACT - 14400 
-LV*HIVSRHT* PSHLILAHT 
-WFNI**AATHDHLT*YLRTL 
GLTYSEPPHMT ISLNTCAHS 
14401 - CGTTAGCTAACCTGTAGAAACGGTGTGATAAGTTACAGCAAGTGTTATGTTTGCGAGCAA - 14460

- R * LTCRNGVISYSKCYVCEQ 
-VS*PVETV**VTASVMFASK 

LANL *KRCDKLQQVLCLRAR 
14461 - GAACAAGAGAGGCCATTATCCTAAGCATGTTAGGCATGGCTCTGTCACATTTTGGATAAT - 14520
-EQERPLS * A C * AWLCH ILDN 
-NKRGHYPKHVRHGSVTFWI I 
TREA I I LSMLGMALSH FG * S 
14521 - CCCAACCCATAAGGTGTGGAGTTTCTACATCACTGTAAACAGTTTTTAACATATTATGCC - 14580 
-PNP* GVEFLHHCKQFLTYYA 
-PTHKVWSFYITVNSF*HIMP 
QPIRCGVSTSL*TVFNILCQ 
14581 - AGCCACCGTAAAACTTGCTTGTTCCAATTACCACAGTAGCTCCTCTAGTGGCGGCTATTG - 14640 
-SHRKTCLFQLPQ*LL*WRLL 
-ATVKLACSNYHSSSSSGGY* 
PP*NLLVPITTVAPLVAA ID 
14641 - ACTTCAATAATTTCTGATGAAACTGTCTATTTGTCATAGTACTACAGATAGAGACACCAG - 14700 

- T S IIS DETVYLS*YYR*RHQ 
-LQ*FLMKLSICHSTTDRDTS 

FNNF**NCLFVIVLQIETPA
14701 - CTACGGTGCGAGCTCTATTCTTTGCACTAATGGCATACTTAAGATTCATTTGAGTTATAG - 14760 
-LRCELYSLH * W H T * D S F E L * 

- Y G A S SILCTNGILKIHLSYS 

TVRALFFALMAYLRFI * V I V 
14761 - TAGGGATGACATTACGCTTAGTATACGCGAAAAGTGCATCTTGATCCTCATAACTCATTG - 14820
-*G*HYA* YTRKVHLDPHNSL 
-RDDITLSIREKCILILITH* 
GMTLRLVYAKSAS * S S * L I E 
14821 - AGTCATAATAAAGTCTAGCCTTACCCCATTTATTAAATGGGAAACCAGCTGATTTATCCA - 14880 
-SHNKV*PYPIY*MGNQLIYP 
-VIIKSSLTPFIKWETS*FIQ 
S * * SLALPHLLNGKPADLSR 
14881 - GATTGTTAACGATTACTTGGTTGGCATTAATACAGCCACCATCGTAACAATCAAAGTATT - 14940
-DC*RLLGWH* YSHHRNNQS I 
-IVNDYLVGINTATIVT IKVF 
LLTITWLALIQPPS*QSKYL 
14941 - TATCAACAACTTCAACTACGAATAGGAGTTGTCTGATATCACACATTGTTGGCAGATTAT - 15000
-YQQLQLRI G V V * YHTLLADY 
INNFNYE*ELSDITHCWQII 
STTSTTNRSCLISHIVGRL* 
15001 - AACGATAATAGTCATAATCACTGATAGCAGCGTTGCCATCCTGAGCAAAGAAGAAGTGTT - 15060 
-NDNSHNH* *QRCHPEQRRSV 
-TIIVIITDSSVAILSKEEVF 
R**S*SLIAALPS*AKKKCF 
15061 - TTAGTTCAACAGAACTTCCTTCCTTAAAGAAACCTTTAGACACAGCAAAGTCATAAAAGT - 15120 
-LVQQNFLP * R N L * TQQSHKS 
*FNRTSFLKETFRHSKVIKV 
SSTELPSLKKPLDTAKS* KS 
15121 - CTTTATTAAAATTACCGGGTTTGACAGTTTGAAAAGCAACATTGTTTGTTAGTGCAGCTA - 15180
-LY*NYRV* QFEKQHCLLVQL 
-FIKITGFDSLKSNIVC*CSY 
LLKLPGLTV* KATLFVSAAT 

15181 - CTGAAAAGCATGTAGTGCGTTTATCTAGCAATAAATTGCCAGAAGCTGCATGCATAGCTG - 15240 
-LKSM*CVYLAINCQKLHA*L 
-*KACSAFI*Q*IARSCMHSW 
EKHVVRLSSNKLPEAACIAG 

15241 - GATCAGCAGCATACACTAAAAGTTCCTTGAAACTGAGACGCGAGCTATGTAAGTTTACAT - 15300 
-DQQHTLKVP*N*DASYVSLH 

-ISSIH*KFLETETRAM*VYI

SAAYTKS SLKLRRELCKFTS 
15301 - CCTGATTATGTACGACTCCTAACTCACGAAAATGGTATCCAGTTGAAACAACAAAAGGAA - 15360 

- P DYVRLLTHENGIQLKQQKE 
-LIMYDS*LTKMVSS*NNKRN

* LCTTPNSRKWYPVETTKGT 
15361 - CACCATCTACAAATATTTTTCTTACTAGTGGTCCAAAACTTGTAGGTGGAAACACAGTAG - 15420
-HHLQIFFLLVVQNL*VETQ*
-TIYKYFSY*WSKTCRWKHSR
PSTNIFLTSGPKLVGGNTVE
15421 - AAAATAACACATTAAAGTTTGCACAATGAAGGATACACCTATCATCCAAACAGTTAATAC - 15480
-KITH*SLHNEGYTYHPNS*Y
-K*HIKVCTMKDTPIIQTVNT
NNTLKFAQ*RIHLSSKQLIQ 
15481 - AATTGGGATGGTATGTCTGGTCCCAATATTTAAAATAACGGTCGAAGAGACAAAGTCTCT - 15540 

- N W DGMSGPNI *NNGRRDKVS 

IGMVCLVP IFKITVEETKSL 
LGWYVWSQYLK* RSKRQSLS 
15541 - CTTCCGTAAAATCATATTTCAGCAAATCCCACTTAATAAGTGGTTTTGCGAGATCAGCAT - 15600 
-LP*NHISANPT**VVLRDQH 
-FRKIIFQQIPLNKWFCEISI 
SVKSYFSKSHLI SGFARSAS 
15601 - CCATATGGGACTCAGCAGCCAATGCCCTAGTCAAAGTGAGGATGGGCATCAGCAATGAGT - 15660
-PYGTQQPMP* SK*GWASAMS 
-HMGLSSQCPSQSEDGHQQ*V 
IW DSAA NALVKVRMG I SNE * 
15661 - AATATGAATCCACAATAGGAACTCCGCAGCCTGGTGCTACTTGTACGAAATCACCGAAAT - 15720 
-NMNPQ*ELRSLVLLVRNHRN 
-I*IHNRNSAAWCYLYEITEI 
YESTIGTPQPGATCTKSPKS 
15721 - CGTACCAGTTCCCATTAAGATCCTGATTATCTAATGTCAGTACGCCTACAATGCCTGCAT - 15780
-RTS SH * DP DYLMSVRLQCLH 

- VPVPIKILII*CQYAYNACI 

YQFPLRS *LSNVSTPTMPAS 
15781 - CACGCATAGCATCGCAGAATTGTACAGTCTTTAATAATGATTGGCGTACACGCTCACCTA - 15840 
-HA* HRRIVQSLIMIGVHAHL 
-THSIAELYSL* **LAYTLT* 
RIASQNCTVFNNDWRTRSPK 
15841 - AGTTAGCATATACGCGTAAGATGTCAGGATTCTCTACGAAGTCATACGAATCCTTCTTAT - 15900 
-S*HIRVRCQDSLRSHTNPSY 
-VSIYA*DVRILYEVIPILLI 
LAYTRKMSGFSTKSYQSFLL 
15901 - TGAAATAATCATCATCACAGCAATTGTATGTGACGAGTATTTCTTTTAATGTATCACAAT - 15960 

- * NNHHHSNCM * RVFLLMYHN 
-EIIIITAIVCDEYFF*CITI

K*SSSQQLYVTSISFNVSQL 
15961 - TACCCTCATCAAAATGACGTAGAGCATAGACTAAATCAGCCATTGTGTATTTAGTTAGAC - 16020
-YPHQNDVEHRLNQPLCI*LD 
-TLIKMT*SID*ISHCVFS*T 
PSSK*RRA*TKSAIVYLVRR 

16021 - GCTGACGTGATATATGTGGTACCATGTCACCATCTACTCTAAACTTGAAAAAGTCATGGA - 16080 

- A D V I YVVPCHHLL* T * KSHG 

LT * YMWYHVTIYSKLEKVMD 
* R D I CGTMSPSTLNLKKSWT 
16081 - CAGCAACCGCTGGACAATCTTTAACCAAGTTATAAATAGTCTCTTCATGTTGGTAGTTAG - 16140
-QQPLDNL* P S Y K * S LHVGS* 
SNRWTIFNQVINSLFMLVVR 
ATAGQSLTKL* IVSSCW*LD 
16141 - ACATAGTATGCCTCTTAACTACAAAGTAAGAGTCTAATAAATTGCCTTCCTCATCCTTCT - 16200 
-T*YAS*LQSKSLINCLPHPS 
-HSMPLNYKVRV**IAFLILL
IVCLLTTK*ESNKLPSSSFS 
16201 - CCTGGAAGCGACAGCAATTAGTTTTTAGGAACTTTGCAAAACCAGCACTTTTTTCGTTGT - 16260 
-PGS DSN* FLGTLQNQH FFRC 
-LEATAISF*ELCKTSTFFVV 
WKRQQLVFRNFAKPALFSL* 
16261 - AAATATCAAAAGCCCTGTAGACGACATCAGTACTAGTGCCTGTGCCGCACGGTGTAAGAC - 16320 
-KYQKPCRRHQY* CLCRTV * D 
-NIKSPVDDISTSACAARCKT 
ISKAL*TTSVLVPVPHGVRR 
16321 - GGGCTGCACTTACACCGCAAACCCGTTTAAAAACGTTGATGCATCCGCAGACTGCATCAA - 16380
-GLHLHRKPV* K R * C IRRLHQ 

- GCTYTAN PFKNVDASADCIK 

AALT PQTRLKTLMHPQTASR 
16381 - GGGTTCGCGGAGTTGGTCACAACTACAGCCATAACCTTTCCACATTCCGCAGACGGTACA - 16440 
-GFAELVTTTAI TFPHSADGT 
-GSRSWSQLQP*PFHIPQTVQ 
VRGVGHNYSHNLSTFRRRYR 
16441 - GACTGTGTTTCTAAGTGTAAAACCCACTGGGTCATTAGCACAAGTGGTAGGTATTTGGAC - 16500 
-DCVS KCKTHWVI ST SGRYLD 
TVFLSVKPTGSLAQVVGIWT 
LCF*V*NPLGH*HKW*VFGR 
16501 - GTACTTACCTTTCAAGTCACAGAATCCTTTAGGATTTGGATGGTCAATGTGGCATCTACA - 16560 
-VLTFQVTESFRIWMVNVAST 

- YLPFKSQNPLGFGWSMWHLQ 

TYLSSHRIL*DLDGQCGIYN
16561 - ATACAGACAACATGAAGCACGACCAAAGGACTCTTGGTCCATGTTAGCTTCTGGTGTTAC - 16620

- I Q T T * STTKGLLVHVS FWCY 

- YRQHEAPPKDSWSMLASGVT 

TDNMKHHQRTLGPC*LLVLQ 
16621 - AGTAATTGCCTGTCCTGTACCAGTGTGTGTACACAACATCTTCACACAGTTGGTGATTGG - 16680
-SNCLSCTSVCTQHLHTVGDW 
-VIACPVPVCVHNIFTQLVIG 
*LPVLYQCVYTTSSHSW*LV 
16681 - TTGTCCTCCACTTGCTAGGTAATCCTTATATGCTTTAGCAGGGTCTACTGCAAAAGCACA - 16740
-LSSTC*VILICFSRVYCKST
-CPPLAR*SLYALAGSTAKAQ
VLHLLGNPYML* QGLLQKHR 
16741 - GAAGGAAAGCACAGTTGAATTGGCAGGTACTTCTGTAGCATTTCCAGCCTGAAGACGTAC - 16800 
-EGKHS*IGRYFCSI SSLKTY 
-KESTVELAGTSVAFPA*RRT 
RKAQLNWQVLL* HFQPEDVL 
16801 - TGTAGCAGCTAAACTGCCCAGCACCATACCTCTATTTAGGTTGTTTAAGCCTTTGATGAA - 16860
-CSS*TAQHHTSI*VV*AFDE 
-VAAKLPSTIPLFRLFKPLMK 

* QLNCPAPYLYLGCLSL* * S 

16861 - GTACAAGTATTTCACTTTAGGCCCTTTTGGTGTGTCTGTAACAAACCTACAAGGTGGTTC - 16920 
-VQVFHFRPFWCVCNKP TRW F 
-YKYFTLGPFGVSVTNLQGGS 
TSISL*ALLVCL*QTYKVVP 

16921 - CAGTTCTGTGTAAATTGTACCTGTACCATCACTCTTAGGGAATCTAGCCCATTTGAGATC - 16980 
-QFCVNCTCTITLRESS PFEI 

- S S V * IVPVPSLLGNLAHLRS 

VLCKLYLYHHS* GI*PI*DL 
16981 - TTGGTGGTCTGATAGTAATGCCAGCACAAACCTACCTCCCTTCGAATTGTTATAGTAGGC - 17040 

- L V V * * *CQHKPTSLRIVIVG 
-WWSDSNASTNLPPFELL**A 

GGLIVMPAQTYLPSNCYSRQ 
17041 - AAGTGCATTGTCATCAGTACAAGCTGTTTGTGTGGTACCAGCCGCACAGGACATCTGTCG - 17100 
-KCIVI STSCLCGTSRTGHLS 
-SALS SVQAVCVVPAAQDICR 
VHCHQYKLFVWYQPHRTSVV 
17101 - TAGTGCTACTGGACTCAGTTCATTATTCTGTAGTTTAACAGCTGAGTTGGCTCTTAGAGC - 17160 
-*CYWTQFIIL*FNS*VGS*S 

- SATGLSSLFCSLTAELALRA 

VLLD SVHYSVV* QLSWLLEL 
17161 - TGTAACAATAAGAGGCCAAGCCAAATTTGGTGAATTGTCCATGTTAATTTGACTAAGTTG - 17220 
-CNNKRPSQIW* IVHVNFTKL 
-VTIRGQAKFGELSMLI S L S * 
*Q*EAKPNLVNCPC*FH*VE 
17221 - AACAATCTTGCTATCCGCATCAACAACTTGCTGGATTTCCCAGAGTGCAGATGCATATGT - 17280 
-NNLAIRINNLLDFPECRCIC 
-TILLSASTTCWISQSADAYV 
QSCYPHQQLAGFPRVQMHM*
17281 - AAAGGTGTTACCATCACAAGTGTTCTTGTAGGTACCATAATCAGGGACAACAACCATGAG - 17340 
-KGVT I TSVLVGT I I RDNNHE 
-KVLPSQVFL*VP*SGTTTMS 
RCYHHKCSCRYHNQGQQP*V 
17341 - TTTGGCTGCTGTAGTCAATGGTATGATGTTGAGTGGAACACAACCATCACGCGCATTGTT - 174 00 

- F G C C S QWYDVEWNTT I TRIV 
-LAAVVNGMMLSGTQPSRALL 

WLL*SMV*C*VEHNHHAHC* 
17401 - GATAATGTTGTTAAGTGCATCATTATCAAGCTTCCTAAGCATAGTGAAGAGCATTGTTTG - 17460
-DNVVKCI I IKLPKHSEEHCL 
-IMLLSASLSSFLSIVKSIVC 

* CC*VHHYQAS*A**RALFA 

17461 - CATAGCACTAGTTACTTTTGCCCTCTTGTCCTCAGATCTTGCCTGTTTGTACATTTGGGT - 17520 
-HSTSY FCPLVLRS CLFVHLG 
IALVTFALLSSDLACLYIWV 
*H*LLLPSCPQILPVCTFGS 
17521 - CATAGCCTGATCTGCCATCTTTTCCAACTTGCGTTGCATGGCAGCATCACGGTCAAACTC - 17580 

- H S L I CHLFQLALHGS I TVKL 
-IA*SAIFSNLRCMAASRSNS 

* PDLPSFPTCVAWQHHGQTQ 

17581 - AGATTTAGCCACATTCAAAGATTTCTTTAACTTTTTGAGAACGACTTCAGAATCACCATT - 17640
-RFSHIQRFL*LFENDFRITI

- DLAT FKDFFNFLRTTSESPL 

I *PHSKISLTF*ERLQNHH* 
17641 - AGCTACAGCCTGCTCATAGGCCTCCTGGGCAGTGGCATAAGCGGCATATGATGGTAAAGA - 17700
-SYSLLIGLLGSGISGI* W*R 

- A T A C S *ASWAVA*AAY DGKE 

LQPAHRPPGQWHKRHMMVKN 
17701 - ACTAAATTCTGAAGCAATAGCCTGAAGAGTAGCACGGTTATCGAGCATTTCCTCGCACAA - 17760

- T K F * SNS LKSSTV IEHFLAQ 
-LNSEAIA*RVARLSSISSHN

* I L K Q * PEE* HGYRAFPRTT 
17761 - CCTATTAATGTCTACAGCACCCTGCATGGATAGCAAAACAGACAAAAGAGAAACCATCTT - 17820
-PINVYSTLHG*QNRQKRNHL 
-LLMSTAPCMDSKTDKRETIF 
Y*CLQHPAWIAKQTKEKPSS 
17821 - CTCGAAAGCTTCAGTTGTGTCTTTTGCAAGAAGAATATCATTGTGGAGTTGTACACATTG - 17880 
-LESFSCVFCKKNIIVELYTL
-SKASVVSFARRISLWSCTHC 
RKLQLCLLQEEYHCGVVHIV 
17881 - TGCCCACAATTTAGAAGATGACTCTACTCTAAGTTGTTGAAGAACCGAGAGCAGTACCAC - 17940
-CPQFRR* LYSKLLKNREQYH 

- AHNLEDDSTLSC*RTESSTT 

PTI *KMTLL*VVEEPRAVPQ 

17941 - AGATGTGCACTTTACGTGAGACATTTTAGACTGTACAGTAGCAACCTTGATACATGGTTT - 18000

-RCALYVRHFRLYSSNLDTWF 
-DVHFTSDILDCTVATLIHGL 
MCTLRQTF* TVQ*QP*YMVY 

18001 - ACCTCCAATACCCAACAACTTAATGTTAAGCTTGAAAGCATCAATACTACTCTTAGGAGG - 18060

-TSNTQQLNVKLES INTTLRR 
-PPIPNNLMLSLKASILLLGG 
LQYPTT*C*A*KHQYYS*EA 
18061 - CAAAAGCCCCTGGGAGTTCATATACCTAAATTCTTGTGTAGAGACCAAGTAGTCATAAAC - 18120 
-QKPLGVHI PKFLCRDQVVIN 

- K S P WEFI YLNSCVETK* S * T 

KAPGSSYT*ILV*RPSSHKH 
18121 - ACCAAGAGTAAGCCTGAAGTAACGGTTGAGTAAACAGAAAAGGCCAAAGTAGCAGCAGCA - 18180 
-TKSKPEVTVE * TEKAKVAAA 
P R V S L K * RLSKQKRPK*QQQ 
Q E * A * SNG*VNRKGQS SSSN 
18181 - ACAATAGCCTAAGAAACAATAAACAAGCATGATACACTGTAAGGTGTTGCCAGTAATAAA - 18240 

- T I A * ETINKHDTL*GVASNK 
-Q*PKKQ*TSMIHCKVLPVIN 

NSLRNNKQA*YTVRCCQ* * I 
18241 - TAACAATGGGTAATACTCAACACACACAAACACTATAGCTCTAGCTAAAAACATGATAGT - 18300 
-*QWVILNTHKHYSSS*KHDS
-NNG* YSTHTNTIALAKNMIV 
TMGNTQHTQTL*L*LKT**S
18301 - CGTAACGACACCAGAATAGTTAGAGGTTACAGAAATAACTAAGGCCCACATGGAAATAGC - 18360 
-RNDTRIVRGYRNN*GPHGNS 
-VTTPE*LEVTEITKAHMEIA 
*RHQNS *RLQK*LRPTWK*L 
18361 - TTGATCTAAAGCATTACCATAGTAGACTTTGTAAACAAGTGTAATGACATTCATCAGTGT - 18420
-LI*SITIVDFVNKCNDIHQC 
-*SKALP* *TL*TSVMTFISV 
DLKHYHSRLCKQV* * HSSVS 
18421 - CCAAACACGTCTAGCAGCATCATCATAAACAGTGCGAGCTGTCATGAGAATAAGCAAAAC - 18480
-PNTSSSIIINSASCHENKQN 
-QTRLAAS S * TVRAVMR I SKT 
KHV * QHHHKQCELS* E * A K L 
18481 - TAAAGCTGAAGCATACATAACACAATCCTTAAGCCTATAACCAGACAAGCTAGTGTCAGC - 18540
-*S*SIHNTILKPITRQASVS 
-KAEAYITQSLSL*PDKLVSA
KLKHT*HNP*AYHQTS*CQP 
18541 - CAATTCAAGCCATGTCATGATACGCATCACCCAGCTAGCAGGCATGTAGACCATATTAAA - 18600
-QFKPCHDTHHPASRHVDHIK 
-NSSHVMIRITQLAGM*TILK 
IQAMS*YASPS*QACRPY*S 
18601 - GTAAGCAACTGTTGCAAGAGAAGGTAACAGAAACAAGCACAAGAATGCGTGCTTATGCTT - 18660 
-VSNCCKRR * QKQAQECV LML 
- *ATVAREGNRNKHKNACLCL 
KQLLQEKVTETSTRMRAYA* 
18661 - AACAAGCAGCATAGCACATGCAGCAATTGCCATAATACCAAGAGTAAATGGCAAGAAAGC - 18720
-NKQHSTCSNCHNTKSKWQES 
-TSSIAKAAIAI I PRVNGKKA 
QAA*HMQQLP*YQE*MARKH 
18721 - ATTCTCGTAAACAAAGAAAAACAGTGACCACTGTGTACTTTGAACAAGAATCAATAGTGA - 18780
-I LVNKEKQ* PLCTLNKNQ* * 
-FS*TKKNSDHCVL*TRINSD
SRKQRKTVTTVYFEQES IVM 
18781 - TGTCAAGAAAGTTAAAAGCATCCAATGATGAGTGCCCTTAACAATTTTCTTGAACTTACC - 18840
-CQES*KHPMMSALNNFLELT 
-VKKVKSIQ* *VPLTIFLNLP 
SRKLKASNDECP*QFS*TYL 
18841 - TTGGAAGGTAACACCAGAGCATTGTCTAACAACATCAAATGGTGTAAACTCATCTTCTAA - 18900
-LEGNTRALSNNIKWCKL I F* 
-WKVTPEHCLTTSNGVNSSSK 
GR*HQSIV*QHQMV*THLLK 
18901 - AATAGTGCTACCAAGGATAGTACGACCATTCATACCATTCTGCAGCAGCTCTTTCAAAGC - 18960
-NSATKDSTTI HT I LQQLFQS 
-IVLPRIVRPFIPFCSSSFKA 
* CYQG* YDHSYHSAAALSKQ 
18961 - AGCACACATATCTAAGACGGCAATTCCTGTTTGAGCAGAAAGAGGTCCCAATATGTCAAC - 19020
-STH I * DGN SCLSRKRSQYVN 
-AHISKTAI PV*AERGPNMST 
HTYLRRQFLFEQKEVPICQH 
19021 - ATGATCTTGTGTCAAAGGTTCATAGTTGTACTTCATTGCCACAAGGTTAAAGTCATTCAA - 19080 
-MILCQRFIVVLHCHKVKVIQ 
-*SCVKGS*LYFIATRLKSFK
DLVSKVHSCTSLPQG*SHSK 
19081 - AGTAGTGGTGAATCTATTAAGAAACCACCTATCACCATTGATAACAGCAGCATACAGCCA - 19140
-SSGESIKKPPITIDNSSIQP 
-VVVNLLRNHLSPLITAAYSH 
*W*IY*ETTYHH**QQHTAM
19141 - TGCCAAAACATTTAATGTTATGGTTGTGTCTGTACCTGCAGCCTGTGCAGTTTGTCTGTC - 19200 
-CQNI*CYGCVCTCSLCSLSV 
-AKTFNVMVVSVPAACAVCLS 
PKHLMLWLCLYLQPVQFVCQ 
19201 - AACAAATGGACCATAGAATTTACCTTCTAAGTCAGTACCAGCGTGTACTCCTGTTGGAAG - 19260 
-NKWTIEFTF*VSTSVYSCWK 
-TNGP*NLPSKSVPACTPVGS 
QMDHRIYLLSQYQRVLLLEA 
19261 - CTCCATATGATGCATATAGCAGAAAGACACGCAATCATAATCAATGTTAAAACCAACACT - 19320 
-LHMMHI AERHAI I INVKTNT 
-SI*CI*QKDTQS*SMLKPTL 
PYDAYSRKTRNHNQC* NQHY 
19321 - ACCACATGATCCATTAAGGAAAGAACCTTTAATGGTATGATTAGGTCTCATGGCACACTG - 19380
-TT*SIKERTFNGMIRSHGTL
-PHDPLRKEPLMV*LGLMAH*
HMIH*GKNL*WYD*VSWHTD 

19381 - ATAAACACCAGATGGTGAACCATTGTAGCATGCTAGAACTGAAAATGTTTGACCAGGTTG - 19440 
-INTRW*TIVAC*N* KCLTRL 

- * TPDGEPL*HARTENV*PGW 

KHQMVNHCSMLELKMFDQVG 
19441 - GATACGGACAAATTTATACTTGGGTGTCTTAGGGTTAGAAGTATCAACTTTAAGCCTAAG - 19500 
-DTDKFILGCLRVRS INFKPK 
-IRTNLYLGVLGLEVSTLSLS
YGQIYTWVS*G*KYQL*A*A 
19501 - CAGACAATTTTGCATAGAATGGCCAATAACACGAAGTTGAACATTGCCAGCCTGAACAAG - 19560 

- Q T ILHRMANNTKLN I A S L N K 
-RQFCIEWPITRS*TLPA*TR 

DNFA*NGQ*HEVEHCQPEQE
19561 - AAAGCTATGGTTGGATTTGCGAATGAGCAGATCTTCATAGTTAGGATTAAGCATGTCTTC - 19620
-KAMVGFANEQI FIVRI KHVF 
-KLWLDLRMSRSS*LGLSMSS
SYGWICE*ADLHS*D*ACLL 
19621 - TGCTGTGCAAATGACATGTCTTGGACAGTATACTGTGTCATCCAACCACAATCCATTAAG - 19680 
-CCANDMSWTVYCVIQPQSIK 
-AVQMTCLGQYTVSSNHNPLR 
LCK*HVLDSILCHPTTIH*E 
19681 - AGTTGTAGTTCCACAGGTTACTTGTACCATGCACCCTTCAACTTTGCCTGACGGGAATGC - 19740 
-SCSSTGYLYHAPFNFA*REC 
-VVVPQVTCTMHPSTLPDGNA 
L * FHRLLVPCTLQLCLTGMP 
19741 - CATTTTCCTAAAACCACTCTGCAGAACAGCAGAAGTGATTGATGTCTGTGGTGGTTGGTA - 19800 
-HFPKTTLQNSRS D * CLWWLV 

- IFLKPLCRTAEVIDVCGGW* 

FS *NHSAEQQK* LMSVVVGR 
19801 - GAGAACATCAGCACCTGAGTTGCTAAAGTCATTTAGAGCCTTTGCTAAGTGGCAGCAAGC - 19860
-ENIST*VAKVI*SLC*VAAS 
-RTSAPELLKSFRAFAKWQQA
EHQHLSC* SHLEPLLSGSKL 
19861 - TGCTTCACGATAGCTGGTAGTATCTAAGGCTCCACTGAAATACTTGTACTTGTTATATAG - 19920 
-CFTIAGSI*GSTEILVLVI* 
-ASR*LVVSKAPLKYLYLLYR
LHDSW*YLRLH*NTCTCYIE 
19921 - AGCAAGATACCTGTTATACTGTGTAAGTGGCAACAGTGTCTCGCTACGCAATTTTAGGTA - 19980 
-SKIPVILCKWQQCLATQF*V
-ARYLLYCVSGNSVSLRNFRY
QDTCYTV*VATVSRYAILGT
19981 - CATTTCCTTGTTGAGCAAAAAGGTACACAAAGCAGCCTCCTCGAAGGTACTAAATGTAAC - 20040

- H FLVEQKGTQS SLLEGTKCN 
-ISLLSKKVHKAASSKVLNVT 

FPC*AKRYTKQPPRRY*M*L 
20041 - TCCATTAAACATGACTCTTTTCCTAAGATAGTTGTTAAAGAACCAATGGCAGTGCTTCAG - 20100 
-SIKHDSFPKIVVKEPMAVLQ

- PLNMTLFLR* LLKNQWQCFR 

H*T*LFS*DSC*RTNGSASE 
20101 - AGAAATACAGAATACATAGATTGCTGTTATCCAAAAAGGCACAATAGGAGAAAACATGGC - 20160
-RNTEY I DCCYPKRHNRRKHG 
-EIQNT*IAVIQKGTIGENMA
KYRI HRLLLSKKAQ*EKTWQ 
20161 - AAACCATTGAAGGTGAGCCAAGAATGAAACATCATTGGTGAAATAGAATGTCAAGTACAA - 20220
-KPLKVSQE*NIIGEIECQVQ
-NH*R*AKNETSLVK*NVKYK 
TIEGEPRMKHHW*NRMSSTS 
20221 - GTAAAAGACTGAGTAGACTCCCGGCAGAAAGCTGTAAGCTGGTACCAGACAGAGTATAGT - 20280 
-VKD*VDSRQKAVSWYQTEYS 
-*KTE*TPGRKL*AGTRQSIV 
KRLSRLPAESCKLVPDRV** 
20281 - GAAAGACATCAAAAACAAAAGTGCATTAGCAGCAACAACATGGTTGTACTCACCAAAAAC - 20340 

- E RHQKQKCI S SNNMVVLTKN 
-KDIKNKSALAATTWLYSPKT 

KTSKTKVH*QQQHGCTHQKH 
20341 - ACGTCTGAATTTCATAAAGTAGTAGGCAGCACAAGTCACCAATATGGCAATAATACCACC - 20400

- T S E FHKVVGSTS HQYGNNTT 
-RLNFIK**AAQVTNMAIIPP

V*IS*SSRQHKSPIWQ*YHQ 
20401 - AGCCACTACTGAAGCAGACACATCTAAAGCACCCACAGGTTGCACAAGAGGAGTAAAGAT - 20460
-SHY *SRHI*STHRLHKRSKD 
-ATTEADTSKAPTGCTRGVKM 
PLLKQTHLKHPQVAQEE* RC 
20461 - GTTAGCTATGAGATTCATCGCATCAACACCACAGAAAACTCCTGATAGAGCTCTGTAATG - 20520
-VSYEIHRINTTENS**SSVM
-LAMRFIASTPQKTPDRAL*C
* L * DSSHQHHRKLLIELCNA 
20521 - CTCATTATTAAGAACCCATCTACCACTGGTAGATAGGCAAATACCTACTTCTGACCTTTC - 20580 

- L I I KNPSTTGR*ANTYF*PF 
-SLLRTHLPLVDRQIPTSDLS 

HY*EPIYHW* IGKYLLLTFR 
20581 - GCATGTACCATGTCTAGAGTACTCAGCATCAAAAGTTGTTACTACTGTAACAGAACCCTC - 20640 
-ACTMS TVLSIKSCYYSNRTL 
-HVPCLQYSASKVVTTLTEPS 
MYHVYSTQHQKLLLL*QNPP 
20641 - CAGGTAAGTGTTAGGAAACTGTATGATGGAACCATCCATAAGCACATAACGAGTGTCTGG - 20700 
-QVSVRKLYDGT IHKHITSVW 

- R * VLGNCMMEPSIST*RVSG 

GKC*ETV*WNHP*AHNECLD 
20701 - ACGAAGCTCACTATAAGAAATAGAACCCTCTAGCAAATTAGTGTCATAACAATATGGCAC - 20760 
-TKLTIRNRTL*QISVITIWH
-RSSL*EIEPSSKLVS*QYGT 
EAHYKK*NPLAN * CHNNMAQ 
20761 - AGGTTTGCCCATAGCATCCTTAAAAATTGTACACTCAGCAGCAAGAACGCAAGCAGAGGT - 20820

- R F A H S ILKNCTLSSKNASRG 

GLPIASLKIVHSAARTQAEV 
V C P * HP*KLYTQQQERKQR* 
20821 - AGCAAAATCACTATACTCAATGAGTTTGGAAGGTGTGTAGCAAATGTTGCCAACAGCACT - 20880 
-SKITILNEFGRCVANVANST 
-AKSLYSMSLEGV*QMLPTAL
QNHYTQ*VWKVCSKCCQQH* 
20881 - AAAAACACGAGGTAGAAAATGCAAGAAGTCACCATTGATTGCTCTCAGCACAGTACCCGG - 20940
-KNTR* KMQEVTI DCSQHSTR 
-KTRGRKCKKSPLIALSTVPG 
KHEVENARSHH* LLSAQYPV 
20941 - TAAGCCAGGCACTATGAAACCAATCTCTCTTGTAATGATAGCAGCTACTACAGGGCAGCT - 21000 

- * ARHYETNLSCNDSSYYRAA 
-KPGTMKPISLVMIAATTGQL 

SQAL*NQSLL***QLLQGSF
21001 - TTTGTCATTTTTGTATGAACCACCACGCTGGCTAAACCATGCGTCAAAACCAGCATGTTT - 21060

- F V I FV * TTTLAKPCVKTSMF 
-LSFLYEPPRWLNHASKPACL 

CHFCMNHHAG* TMRQNQHVY 
21061 - ATTTGCAAAACAATCATCAGTAGAAATGATGTCACGAGTGACACCATCCTGAATGGCTTT - 21120 
-ICKT I I SRNDVTSDTILNGF 
-FAKQSSVEMMSRVTPS * M A L 
LQNNHQ*K*CHE*HHPEWLC
21121 - GTAACCAATGATTTCATTTGTGTAACCATCATGGATTGACAATGTATGTACTGGCATAAC - 21180 
-VTNDFICVTIMD*QCMYWHN 

- * PMISFV*PSWIDNVCTGIT 

NQ*FHLCNHHGLTMYVLA*R 
21181 - GATATAACAAACCAATGCAGCAAGAACGCACAATAATGTGGCCTTAAGCATAAGTTTAAA - 21240 

- D I TNQCSKNAQ*CGLKHKFK 

-I*QTNAARTHNNVALSISLK
YNKPMQQERTIMWP*A*V*N 
21241 - ACAAGTACTAACAATCTTACCACCCTTGAGTGAGATTTTAGTAGTTATGACATTGACAAC - 21300 
-TSTNNLTTLE*DFSSYDIDN
-QVLTILPPLSEILVVMTLTT 
KY*QSYHP*VRF**L*H*QP 
21301 - CTGTCTAGTTGTAGCACAAGTTAGTGTAAAAGGTATGTTGTTCTTCTTGGCAGCAGTACG - 21360
-LSSCSTS*CKRYVVLLGSST 
-CLVVAQVSVKGMLFFLAAVR 
V*L*HKLV*KVCCSSWQQYE 
21361 - AATTTGTTTACGCAGCTGTTCAGATAAAGACATGTAGTCTTTTACATTCCAGATGAGTGA - 21420

- N L FTQLFR*RHVVFYI PDE* 

-ICLRSCSDKDM*SFTFQMSE
FVYAAVQIKTCSLLHSR*VK 
21421 - AACATTGTGACTTTTTGCTACTTGGGCATTGATATGCCTTGCATTACAGTCAATACATGC - 21480 
-NIVTFCYLGI DMPC ITVNTC 
-TL*LFATWALICLALQSIHA
HCDFLLLGH*YALHYSQYMR 
21481 - GCCAAGATCTCTGGGCGTCATGTTTTCAACCTTATTATAGGTGAGCATGAAATTGTTACA - 21540 

-AKISGRHVFNLIIGEHEIVT

-PRSLGVMFSTLL*VSMKLLQ
QDLWASCFQPYYR*A*NCYN 
21541 - ACTGTCACCTGTCACTTCTAAGTCAGAGTGATGTGAAAGTTTGAGACATTCAATAACATC - 21600 
-TVTCHF*VRVM*KFETFNNI
-LSPVTSKSE*CESLRHSITS 
CHLSLLSQSDVKV*DIQ*HP 
21601 - CTTTGTGTCAACATCGGTATCAACAACACCTTGTCGGGCAGCTGACACGAATGTAGAAAG - 21660 
-LCVNIGINNTLSGS * H E C R K 
-FVSTSVSTTPCRAADTNVER 
LCQHRYQQHLVGQLTRM* KG 
21661 - GACACCATCTAAAGCTACACCCTTTGCTAACTCGCTGTGAGCTGTAGCAACAAGTGCCTT - 21720 
-DTI*SYTLC*LAVSCSNKCL 
-TPSKATPFANSL*AVATSAL 
HHLKLHPLLTRCEL*QQVP* 
21721 - AAGTTTTTCCATAGGAACACTAAAAGTTGCTGAAAAGGTGTCGACATAAGCATCAAACAT - 21780 
-KFFHRNTKSC* KGVDI SI KH 

- S F S IGTLKVAEKVST*ASNI 

VFP* EH*KLLKRCRHKHQTS 
21781 - CTTAACGGAAACTTCAGTACTATCTCCAACGTTTGATACAAGAGCTTGGTCAAGCAACAG - 21840
-LNGNFSTISNV*YKSLVKQQ
-LTETSVLSPTFDTRAWSSNR 
* RKLQYYLQRLIQELGQATE 
21841 - AATAGGTTGGCACATCAGCTGACTGTAGTACACAGAAGCAGACTTAGAAGCAGACTCGTC - 21900
-NRLAHQLTVVHRSRLRSRLV 

- IGWHIS*L*YTEADLEADSS 

* VGTSADCSTQKQT* KQTRR 
21901 - GCATTTGGACTTGCCATCAAAAACTATGACATTAATAGGCAGTGAACCTTTAGTGTTGTT - 21960 
-AFGLAIKNYDINRQ*TFSVV 
-HLDLPSKTMTLIGSEPLVLL 
IWTCHQKL*H**AVNL*CC* 
21961 - AGCTCTCAAATTGTCTAAATTGACAAAATGGGAGAGCGGATGTCTCTCATAGGTCTTTTG - 22020 
-SSQIV*IDKMGERMSLIGLL 
-ALKLSKLTKWESGCLS*VF*
LSNCLN*QNGRADVSHRSFD
22021 - ACCAGCCTTGTCAAAGTAGAGGTGAAGCGCGCCATTTTTCACAGCAACACTATCAACAAT - 22080
-T SLVKVEV KRAI FHSNT INN 
-PALSK*R*SAPFFTATLSTI
QPCQSRGEARHFSQQHYQQY 
22081 - ATACGATGACTGGTCAGTAGGGTTGATTGGTCTTTTAAACTGGAGTGACAAATCACGAGC - 22140
-IR*LVSRVDWSFKLE*QITS
-YDDWSVGLIGLLNWSDKSRA
TMTGQ* G * L V F * TGVTNHEQ 
22141 - AACTTCATCACTAATGAATGTACTACCAGTGCAAAATGTGTCACAATTGAGACAATTCCA - 22200 

- N FITNECT TSAKCVTI ETIP 
-TSSLMNVLPVQNVSQLRQFQ 

LHH**MYYQCKMCHN*DNSN
22201 - ATTGTGAGTCTTGCAGAAGCCACGGCCTCCATTTGCATAGACATAGAAAGATCTCTTCAT - 22260 
-IVSLAEATAS ICI DIERS LH 
-L*VLQKPRPPFA*T*KDLFM
CESCRSHGLHLHRHRKISSC 
22261 - GCCATTAACAATAGTTGTACACTCAACGCGTGTGGCACGATTGCGCTTATAGCACATCAT - 22320 
-AINNSCTLNACGTIALIAHH 
-PLTIVVKSTRVARLRL*HIM 
H * Q * LYTQRVWHDCAYSTSC 
22321 - GCAAGTCGAAGAGGTGCAACCATCCATGATATGAACATAGCTCTTCCATATGTAGTAGAA - 22380 
-ASRRGATI HDMNIALPYVVE 
-QVEEVQPSMI *T*LFHM* * K 
KSKRCNHP*YEHSSSICSRK 
22381 - AGAAGCAAAGAAGATGTACATCCTAACCATTGCAGAAACGGGTGCCATTTGTACAATACT - 22440 

-RSKEDVHPNHCRNGCHLYNT
-EAKKMYILTIAETGAICTIL 

KQRRCTS *PLQKRVPFVQY* 
22441 - AATGATAAACCACATGAGCCAAGAATTGCTGATGAAATGACTAGCAAAATAGCCAAAGAA - 22500 
-NDKPHEPRIADEMTSKIAKE 
-MINHMSQELLMK* LAK* PKN 
**TT*AKNC**ND*QNSQRT 
22501 - CACCTGCATTATAGCTGAAAGACCTAATAAATAAAAGAATTTTGTGAACAACATATATGC - 22560 

-HLHYS*KT**IKEFCEQHIC
-TCI IAERPNK*KNFVNNIYA 

PAL*LKDLINKRIL*TTYMP 
22561 - CAAAACCCACTCAGCGGCCAGACCTAAAATTGTCAAGTCTAGCTTGTACGATGAAATCGT - 22620
-QNPLSGQT*NCQV*LVR*NR 
-KTHSAARPKIVKSSLYDEIV 
KPTQRPDLKLSSLACTMKSS 
22621 - CACCTGAATGGTTTCAAGAGCTGGATAAGAATCAAGGGAGTCTAATCCACTTAAACAAAT - 22680
-HLNG FKSW I RIKGV * S T * T N 
-T*MVSRAG*ESRESNPLKQM 
PEWFQELDKNQGSLIHLNKC 
22681 - GCTGCAAGGAAAAGAACCTTCACAGAAATCCATAGTAGTAACGTTAGACGAATTAAGATA - 22740
-AARKRTFTEIHS SNVRRIKI 

- LQGKEPSQKS IVVTLDELRY 

CKEKNLHRNP***R*TN*DT 
22741 - CAATTCTCTAACGCCATTACAATAAGAAGGAGCACCAAAATTAGATAAGAGTACACCAAA - 22800 
-QFSNAIT I RRSTKIR*EYTK 
-NSLTPLQ*EGAPKLDKSTPK 
IL*RHYNKKEHQN*IRVHQK 
22801 - AGCAGCAGTTACACAGATTAGAGAACCTAAGCAAATACTTAACAACAATAGCCACATAGC - 22860
-SSSYTD*RT*ANT*QQ* PHS 
-AAVTQIREPKQILNNNSHIA 
QQLHRLENLSKYLTTIAT*R 
22861 - GATTGTGAACAATTTAGAAAATTTGGGTGACTTCACATAATTAATGCCGGCATCCAAACA - 22920
-DCEQFRKFG* LHI INAGIQT 

- IVNNLENLGDFT *LMPASKH 

L*TI*KIWVTSHN*CRHPNI 
22921 - TAATTTAGCAACACTCTTAACACTATTTTTAGCAATAGTTGTAGGTAGTGAAGCTCTAAT - 22980 

- * FSNTLNTIFSNSCR**SSN 
-NLATLLTLFLAIVVGSEALI 

I*QHS*HYF*Q*L*VVKL*F 
22981 - TCTAGAATTGGTACTTTTAGTAAAAGTACACAATTGGAACAATAATGTAAACACATAAGG - 23040 
-SRI GT FSKSTQLEQ * CKHIR 
LELVLLVKVHNWNNNVNT*G 

* N W Y F * *KYTIGTIM*THKA 

23041 - CATATAATTGTTAAACACACGTTGTGCTAATCTCTTAGCGCAATTTGATGTTGTAATTGC - 23100 
-HIIVKHTLC*SLSAI*CCNC 

- I*LLNTRCANLLAQFDVVIA 

Y N C * THVVLIS*RNLML*LL 
23101 - TGCTTGTCCTAAGAATGGTTTGACATAAGCCAAAATTTTACTCCAAGGAACACTATTAAT - 23160 

- C L S * EWFDISQNFTPRNT IN 
-ACPKNGLT*AKILLQGTLLI 

LVLRMV*HKPKFYSKEHY*L
23161 - TGCAGCAATACCATGAGTGGCAATTGTTTTTAAACCTAAGGCTAGTGAAAGCTCATTAGG - 23220 
-CSNTMSGNCF*T*G* * K L I R 
-AAIP*VAIVFKPKASESSLG
QQYHEWQLFLNLRLVKAH * V 
23221 - TTTCTTAATGGTAATGCTTGTGTTTTCCACATAAGCAGCCATAAGATCCTCATGACCTAA - 23280 
-FLNGNACVFHI SSHKILMT* 
-FLMVMLVFST*AAIRSS*PN
S*W*CLCFPHKQP*DPHDLT 
23281 - CTCTTGTGTTACTTTAACACCTTCATCTGATGGTTTAAGTATGACATTGCCTACAACTTC - 23340
-LLCYFNTFI*WFKYDIAYNF
-SCVTLTPSSDGLSMTLPTTS 
LVLL*HLHLMV*V* HCLQLR 
23341 - GGTAGTTTTCACGTCACACTCTATGACTTCCTTCTGTATGGTAGGATTTTCCACTACTTC - 23400
-GSFHVTLYDFLLYGRIFHYF
-VVFTSHSMTSFCMVGFSTTS 

* FSRHTL*LPSVW*DFPLLL 

23401 - TTCAGAGGTGGGTTGTTGACTTTCACAAGCAAGATTGTCCATTCCTTGTGTGTCTTCTAC - 23460
-FRGGLLTFTSKIVHSLCVFY 
-SEVGC*LSQARLSIPCVSST
QRWVV DFHKQDCPFLVCLLL 
23461 - TGCCAGAACTTCAAATGAATTTGAAGTATCTACTGGCTTTGTACTCCAAAGACAACGTAA - 23520
-CQNFK*I*SIYWLCTPKTT*
-ARTSNEFEVSTGFVLQRQRK 
PELQMNLKYLLALYSKDNVN 
23521 - ACACCAAGTGTTTGGTTTGAACGTTGTCTTGGTTGTAGCCTGGTTAATGTGCCAAACAAT - 23580
-TPSVWFERCLGCSLVNVPNN 
-HQVFGLNVVLVVAWLMCQTI 
TKCLV*TLSWL*PG*CAKQL 
23581 - TGGCTTATGCAGTAATTTAGCACCTTTCTTGAAACTCGCTGAATAGTGTCTATAGTCAAT - 23640 
-WLMQ* FSTFLETR* IVSIVN 
-GLCSNLAPFLKLAE*CL*SI
AYAVI*HLS*NSLNSVYSQ* 
23641 - AGCCACTACATCGCCATTCAAGTCTGGGAAGAATGTGACAGATAGCTCTCGTGAAGCTGG - 23700 
-SHYIAIQVWEECDR*LS*SW
-ATTSPFKSGKNVTDSSREAG 
PLHRHSSLGRM*QIALVKLA 
23701 - CTTTGTGAAGCCTGTCATTTGATTTAAATCATCAGCAAATTTTGTGTTAGAACATGTGAG - 23760 
-LCEACHLI * I I SKFCVRTCE 

- FVK PV I * FKS SANFVLEHVS 

L * SLSFDLNHQQILC*NM*V 
23761 - TTTGAAATTATCAAAACTCGCATTTGGTAATGGTTGAGTTGGTACAAGGTCTATAGGCTG - 23820 

- F E I IKTRIW*WLSWYKVYRL 
-LKLSKLAFGNG*VGTRSIGC

*NYQNSHLVMVELVQGL*AA 
23821 - CTCTGTATAGTAAGCATTATCCTTTTTATAATACCCATCCAATTTTGGTTCAATCTCTGT - 23880 
-LCIVSIILFIIPIQFWFNLC 
-SV**ALSFL*YPSNFGSISV 
LYSKHYPFYNTHPILVQSLC 
23881 - GTAAGTAACTCCATCGAGTTTATACGACACAGGCTTGATGGTTGTAGTGTAAGATGTTTC - 23940 
-VSNS I EFIRHRL DGCSVRCF 
-*VTPSSLYDTGLMVVV*DVS
K*LHRVYTTQA*WL*CKMFP 
23941 - CTTGTAGAAAACATCAGTCACTGGTCCTTTGTACTCTGACATCTTTGTAAGGTGAGCTCC - 24000 
-LVENISHWSFVL*HLCKVSS 
-L*KTSVTGPLYSDIFVR*AP 
CRKHQSLVLCTLTSL*GELR 
24001 - GTCAATACGATAGAGGGTCTCCTTAGCAGTTATATGAGTGTAATGACCACACTGATAGTT - 24060 
-VNTIEGLLSSYMSVMT TLIV 
-SIR*RVSLAVI*V**PH**L 
QYDRGSP*QLYECNDHTDSY 
24061 - ACCAGTGTACTCATTCGCACATAAGAATGTACCTTGCTGTAATTTATACTCAGCAGGTGG - 24120
-TSVLIRT*ECTLL* FILSRW 
PVYS FAHKNVPCCNLYSAGG 
QCTHSHIRMYLAVIYTQQVV 
24121 - TGCAGACATCATAACAAAAGAAGACTCTTGTTGTACTAGATATTGTGTAGCATCACGACC - 24180 
-CRHHNKRRLLLY * I LCS ITT 
-ADIITKEDSCCTRYCVASRP 
QTS * QKKTLVVLDIV*HHDH 
24181 - ACACACACATGGAATGGAAACACCTGTCTTAAGATTATCATAAGATAGAGTACCCATATA - 24240
-THTWNGNTCLKIIIR*STHI

- HTHGMETPVLRLS* DRVPIY 

THMEWKHLS* DYHKIEYPYT 
24241 - CATCACAGCTTCTACACCCGTTAAGGTAGTAGTTTTCTGACCACAATGTTTACACACCAC - 24300 
-HHS FYTR* GS SFLTTMFTHH 

- ITASTPVKVVVF*PQCLHTT 

SQLLHPLR** FSDHNVYTPH 
24301 - ATTAAGAACTCGCTTTGCAGATTCCAAATTAGCATGCTGTAGAAGATGGGTCATAGTTTC - 24360

- I KNSLCRFQI S M L * KMGHSF 
-LRTRFADSKLACCRRWVIVS 

*ELALQIPN*HAVEDGS*FL 
24361 - TCTGACATCACCAAGCTCGCCAACAGTTTTATTACTGTAAGCGAGTATGAGTGCACAAAA - 24420

- S D I TKLANSFITVSEYECTK 
-LTSPSSPTVLLL*ASMSAQK 

* HHQARQQFYYCKRV* VHKS 
24421 - GTTAGCAGCATCACCAGCACGGGCTCTATAATAAGCCTCTTGAAGTGCTGGTGCATTGAA - 24480
-VSS ITSTGSI ISLLKCWCIE 

- L A A S P A R A L * * A S * SAGALN 

*QHHQHGLYNKPLEVLVH*I 
24481 - TTTGACTTCAAGCTGTTGAAGTGCTAATAAAACACTAGACAAATAACAATTGTTATCAGC - 24540 
-FDFKLLKC* *NTRQITIVIS 

- L T S S C * SANKTLDK*QLLSA 

*LQAVEVLIKH*TNNNCYQP
24541 - CCATTTAATTGAAGTTAAACCACCAACTTGAGGAAATTTCCATTTCTTTGTGTGGTTTAA - 24600 

- P F N * S *TTNLRKFPFLCVV* 
-HLIEVKPPT*GNFHFFVWFK 

I*LKLNHQLEEISISLCGLK
24601 - AGCAGACATGTACCTACCAAGAAAACTCTCATCAAGAGTATGGTAGTACTCGAAAGCTTC - 24660 
-SRHVPTKKTLI KSMVVLESF 
-ADMYLPRKLSSRVW*YSKAS 
QTCTYQENSHQEYGSTRKLH 
24661 - ACTACGTAGTGTGTCATCACTAGGTAGTACAAAGAAAGTCTTACCCTCATGATTTACATG - 24720

- T T * CVITR*YKESLTLMI YM 
-LRSVSSLGSTKKVLPS * F T * 

YVVCHH*VVQRKSYPHDLHE 
24721 - AGGTTTAATTTTTGTAACATCAGCACCATCCAAGTATGTTGGACCAAACTGCTGTCCATA - 24780 
-RFNFCNISTIQVCWTKLLSI
-GLIFVTSAPSKYVGPNCCPY
V*FL*HQHHPSMLDQTAVHM
24781 - TGTCATAGACATATCCACAAGCTGTGTGTGGAGATTAGTGTTGTCCACAGTTGTGAACAC - 24840
-CHRHIHKLCVEISVVHSCEH 
-VI DISTSCVWRLVLSTVVNT 
S*TYPQAVCGD*CCPQL*TL 
24841 - TTTTATAGTCTTAACCTCCCGCAGGGATAAGAGACTCTTTAGTTTGTCAAGTGAAAGAAC - 24900 
-FYS LN LPQG* E T L * F V K * K N 
FIVLTSRRDKRLFSLSSERT 
L * S * PPAGIRDSLVCQVKEP 
24901 - CTCACCGTCAAGATGAAACTCGACGGGGCTCTCCAGAGTGTGGTACACAATTTTGTCACC - 24960
-LTVKMKLDGALQSVVHNFVT 
-SPSR*NSTGLSRVWYTILSP
HRQDETRRGSPECGTQFCHH 
24961 - ACGCTTAAGAAATTCAACACCTAACTCTGTACGCTGTCCTGAATAGGACCAATCTCTGTA - 25020 
-TLKKFNT*LCTLS*IGPISV
-RLRNSTPNSVRCPE * D Q S L * 
A* EI QHLTLYAVLNRTNLCK 
25021 - AGAGCCAGCCAAAGAAACTGTTTCTACAAAGTGCTCCTCAGATGTCTTTGATGACGAAGT - 25080 
-RASQRNCFYKVLLRCL* * R S 
-EPAKETVSTKCSSDVFDDEV 
SQPKKLFLQSAPQMSLMTK* 
25081 - GAGGTATCCATTATATGTAGTAACAGCATCTGGTGATGATACTGACACTACGGCAGGAGC - 25140 
-EVSIICSNSIW**Y*HYGRS
-RYPLYVVTASGDDTDTTAGA 
G I HYM* * Q H L V M I LTLRQEL 
25141 - TTTAAGAGAACGCATACAGCGCGCAGCCTCTTCAAGATTAAAACCATGTGTCACATAACC - 25200 
-FKRTHTARSLFKIKTMCHIT 
LRERIQRAASSRLKPCVT* P 
*ENAYSAQPLQD*NHVSHNQ 
25201 - AATTGGCATTGTGACAAGCGGCTCATTTAGAGAGTTCAGCTTCGTAATAATAGAAGCTAC - 25260
-NWHC DKRLI *RVQLRNNRSY 
-IGIVT SGSFREFSFVI IEAT 
L A L * QAAHLESSAS** * K L Q 

25261 - AGGCTCTTTACTAGTATAAAAGAAGAATCGGACACCATAGTCAACGATGCCCTCTTGAAT - 25320 

- R L F T S IKEESDTIVNDALLN 
-GSLLV*KKNRTP*STMPS*I 

ALY*YKRRIGHHSQRCPLEF
25321 - TTTAATTCCTTTATACTTACGTTGGATGGTTGCCATTATGGCTCTAACATCCATGCATAT - 25380 
-FNSFILTLDGCHYGSNIHAY
-LIPLYLRWMVAIMALTSMHI 

*FLYTYVGWLPLWL*HPCI*

25381 - AGGCATTAATTTTCTTGTCTCTTCAGCATGAGCAAGCATTTCTCTCAAATTCCAGGATAC - 25440 
-RH*FSCLFSMSKHFSQIPGY 
-GINFLVSSA*ASISLKFQDT 
ALIFLSLQHEQAFLSNSRIQ 
25441 - AGTTCCTAGAATCTCTTCCTTAGCATTAGGTGCTTCTGAAGGTAGTACATAAAATGCAGA - 25500 
-SS*NLFLSIRCF*R*YIKCR 
-VPRISSLALGASEGST*NAD
FLES LP*H*VLLKVVHKMQI 
25501 - TTTGCATTTCTTAAGAGCAGTCTTAGCTTCCTCAAGTGTATAACCAGCACATCCTTGTCC - 25560 
-FAFLKSSLSFLKCI TS TSLS 
-LHFLRAVLASSSV*PAHPCP 
CIS*EQS*LPQVYNQHILVQ
25561 - AGGGTACGTGGTTATATACTCATCAACTGGCACTTTCTTCAAAGCTCTTGAGAGCATCTC - 25620 
-RVRGYILINWHFLQSS*EHL

- G Y V V I YSSTGTFFKALESIS 

GTWLYTHQLALSSKLLRASQ 
25621 - AGTAGTGCCACCAGCCTTTTTGGAGGGTATTACAACACAAGTGATATCACCACTAGTGAT - 25680 
-SSATSLFGGYYNTSDITTSD 
-VVPPAFLEGITTQVISPLVI 
*CHQPFWRVLQHK*YHH*** 
25681 - AACATCACCTACCATGTAAGGTGCATCCTTCTCAAGGAAAGACATATCTTCACCTCTAAG - 25740 

- N I TYHVRCILLKERH I FTSK 
-TSPTM*GASFSRKDISSPLS 

HHLPCKVHPSQGKTYLHL*A
25741 - CATGTTCTGAGAATCATGGTAAAGCTTACCATTGATATCAGCAAACAAGAGTAACTTATT - 25800 
-HVLRI MVKLTIDI S K Q E * LI 

- MF*ESW*SLPLI SANKSNLL 

CSENHGKAYH* YQQTRVTYW 
25801 - GGTAAGAAACTTAGTTTCTTCCAGTGTTGTGGTAACCTCATCAATGCAGGCCTTAATTTT - 25860
-GKKLS FFQCCGNLINAGLNF 
-VRNLVSSSVVVTSSMQALIF 
*ET*FLPVLW*PHQCRP*FL 
25861 - TGGCTTCACATCGACAGGCTTCTGTACGACAGATTTCTCCTCAGTTTTGGAATCTTCTGT - 25920 
-WLHIDRLLYDRFLLSFGIFC
-GFTSTGFCTTDFSSVLESSV 
ASHRQASVRQISPQFWNLLC 
25921 - GTTTGGTGGCTCCTCTTGTTTAGGTGCTTCCACTCTAGGCTTCAGGTTATCAAGATAATC - 25980
-VWWLLLFRCFHSRLQVIKII
-FGGSSCLGASTLGFRLSR*S 
LVAPLV*VLPL*ASGYQDNP 
25981 - CATGACAACCTGCTCATAAAGAGCTTTGTCATTGACTGCAATATAAACCTGTGTACGAAC - 26040 

- H DNLL IKSFVI DCN IN LCTN 
-MTTCS*RALSLTAI *TCVRT 

* QPAHKELCH* LQYKPVYEP 
26041 - CGTCTGCACGCACACTTGTAAAGACTGAAGTGGTTTAGCACCAAATATGCCTGCTGACAA - 26100
-RLHAHL*RLKWFSTKYAC*Q 

- VCTHTCKD* SGLAPNMPADN 

SARTLVKTEVV*HQICLLTT 
26101 - CAATGGTGCAAGTAAGATGTCCTGTGAATTGAAATTTTCATATGCTGCCTTAAGAAGCTG - 26160 

- Q W C K * DVL*IEIFICCLKKL 
-NGASKMS CELKFSYAALRSW 

MVQVRCPVN*NFHMLP*EAG 
26161 - GATGTCCTCACCTGCATTTAGGTTAGGTCCAACAACATGCAGACACTTCTTAGCAAGATT - 26220 
-DVLTC I *VRSNNMQTLLSKI 
-MSSPAFRLGPTTCRHFLARL 
CPHLHLG*VQQHADTS*QDY 
26221 - ATGTCCAGAAAGCAAACAAGACCCTCCTACTGTAAGAGGGCCATTTAGCTTAATGTAATC - 26280 
-MSRKQTRPSYCKRAI * L N V I 
-CPESKQDPPTVRGPFSLM*S 
VQKANKTLLL* E G H L A * CNH 
26281 - ATCACTCTCCTTTTGCATGGCACCATTGGTTGCCTTGTTGAGTGCACCTGCTACACCACC - 26340 
-ITLLLHGTIGCLVECTCYTT 
-SLSFCMAPLVALLSAPATPP 
HSPFAWHHWLPC*VHLLHHH
26341 - ACCATGTTTCAGGTGTATGTTAGCAGCATTTACAATCACCATAGGATTAGCACTTTGTGC - 26400 
-TMFQVYV S S IYNHHRI STLC 
-PCFRCMLAAFTITIGLALCA 
HVSGVC*QHLQSP*D*HFVP 
26401 - CTCCTTAACGATGTCAACACATTTAATGGCAACATTGTCAGTAAGTTTTAAATAACCAGT - 26460 
-LLNDVNTFNGNIVSKF* ITS 
SLTMS THLMATLSVSFK* PV 
P*RCQHI*WQHCQ*VLNNQ*
26461 - AAACTGATTAACTGGTTCTTCAGGTGTAGGTTCTGGTTCTGGCTCAATCTCTGATTGCTC - 26520 
-KLINWFFRCRFWFWLNL*LL
-N*LTGSSGVGSGSGSISDCS 
TD*LVLQV*VLVLAQSLIAQ 
26521 - AGTAGTATCATCCAGCCAGTCTTCCTCTTCTTCTTCCTCAACTCGAACTGTTTCAGCTGA - 26580 
-SSIIQPVFLFFFLNSNCFS*
-VVSSSQSSSSSSSTRTVSAE
* YHPASLPLLLPQLELFQLR 
26581 - GGCACCAAATTCCAGAGGGAGACCTTGATAATCATCCTCTGTACCGTACTCATGTTCACA - 26640
-GTKFQRETLI I ILCTVLMFT 
-APNSRGRP**SSSVPYSCSQ
HQIPEGDLDNHPLYRTHVHR 
26641 - GGTTTCATCAATTTCTTCTTCCTCACACTCTGCATCGTCCTCTTCTTCCTCATCTGGAGG - 26700

- G F I N FFFLTLCIVLFFLI WR 
-VSSISSSSHSASSSSSSSGG 

FHQFLLPHTLHRPLLPHLEG
26701 - GTAAAAGGAACAATACATACGTGATGAAAAGTTTTCTTCACCAGCATCATCAAATAAGTA - 26760
-VKGTIHT**KVFFTSIIK*V 
-*KEQYIRDEKFSSPASSNK* 
KRNNTYVMKSFLHQHHQISR 
26761 - GAATGTAGCTACACTCCACTCATCAAGATCAATACCCATGTTGGTAAGGAGATCAGAAAC - 26820 
-ECSYT PLIKINTHVGKEIRN 
-NVATLHS SRSI PMLVRRSET 
M*LHSTHQDQYPCW*GDQKL
26821 - TGGTTGTAAAGTCTTCACAACAGCCTCTGCTACAACACATGCAAACTCAGTAACTTCGGT - 26880

-WL*SLHNSLCYNTCKLSNFG
-GCKVFTTASATTHANSVTSV 

VVKS SQQPLLQHMQTQ*LRY 
26881 - ACCGGATTCAACAGTGTAGACAGAGCACTTTTCATTAAGCACTTTGTCAACACGTTCATC - 26940
-TGFNSVDRALFIKHFVNTFI 
-PDSTV*TEHFSLSTLSTRSS
RIQQCRQSTFH*ALCQHVHQ 

26941 - AAGCTCAAATGTGATTCTCACATTCTTGTAACCTTGAACTTCCCAAACAGTATCTTCTCC - 27000 
-KLKCDSH I L V T L N F P N S I FS 
-SSNVILTFL*P*TSQTVSSP
AQM*FSHSCNLELPKQYLLQ 

27001 - AAAGGTTACACCTTTAATTGGTGCACCCCCTTTTAAGCGAAAGACATTGTTTGTAGCCAG - 27060 
-KGYTFNWCTPF*AKDIVCSQ 

- KVTPLIGAPPFKRKTLFVAS 

RLHL*LVHPLLSERHCL*PV
27061 - TAAACCAGGAGACAATGCGCAGTATTGTTCTTTGTCCTTAATCTCTAAGAGCATGAGGCC - 27120 

- * TRRQCAVLFFVLNL*EHEA 
-KPGDNAQYCSLSLISKSMRP

NQETMRSIVLCP*SLRA*GH
27121 - ATTTACACAGACTGGTGTGCCGACGATAGCTCCATTTGTGAAGCTATCAACGGGCGTCTC - 27180 
-IYTDWCADDSSICEAINGRL 
-FTQTGVPTIAPFVKLSTGVS 
LHRLVCRR*LHL*SYQRASR 
27181 - GAGTGCTTCGAGTTCACCGTTCTTGAGAACAACCTCCTCAGAGGTAAGTACTGTGTCATG - 27240 
-ECFEFTVLENNLLRGKYCVM 
-SASSSPFLRTTSSEVSTVSC 
VLRVHRS*EQPPQR*VLCHV 
27241 - TGAATCACCTTCAAGAAAGGTTACTTCTTTTGGTGCCTTAAGAGGCATGAGTAGTTGCAG - 27300 

- * IT FKKGYFFWCLKRHE * LQ 
-ESPSRKVTSFGALRGMSSCS 

NHLQERLLLLVP*EA*VVAA 
27301 - CTGCTCCTTGCCACGTATACACTGACGGTAAAGTCCCTTGCTTTGAGCGATGAAGACTTC - 27360 
-LLLATYTLTVKSLALSDEDF 

-CSLPRIH*R*SPLL*AMKTS

APCHVYTDGKVPCFER*RLH 
27361 - ACCTAAGTTGAGTGATCGCAACTTTGCGCCAGCGATAGTGACTTGATCAATGCACATTTC - 27420
-T*VE*SQLCASDSDLINAHF
-PKLSDRNFAPAIVT*SMHIS 
LS*VIATLRQR**LDQCTFR 
27421 - GAGTGCCTTGTTAACAACATCAATGAAGCATTTTACACAATCCTTGATGTTATCTGAAGC - 27480
-ECLVNN INEAFYTILDVI * S 
-SALLTTSMKHFTQSLMLSEA
VPC*QHQ*SILHNP*CYLKQ 
27481 - AACCTGTATTTGACCCTTGACGATGTCAAAAACACCTGTAATGAGAAATTTGAGAATCTC - 27540 
-NLYLTLDDVKNTCNEKFENL 
-TCI* PLTMSKTPVMRNLRIS 
PVFDP*RCQKHL**EI*ESP 
27541 - CCAAGCATCCTTGAGAAATTCAACTCCTGCACTAAGTTTCGCCTCAATCCATTCAAAGAT - 27600 
-PSILEKFNSCTKFRLNPFKD 
-QASLRNSTPALSFASI HSKI 
KHP*EIQLLH*VSPQSIQR* 
27601 - AGGCCTGAGTTTTTCAACAGTAGTGCCCAAAAGATTAGACAACCACTGAGAAGTCTGTTG - 27660
-RPEFFNS SAQKIRQPLRSLL 
-GLSFSTVVPKRLDNH* EVCC 
A*VFQQ*CPKD*TTTEKSVV 
27661 - TACAAGACCACCAGTTACATATGCCATAATAATGACACTGTTGGTGAGCAGGTCTGAAGT - 27720
-Y KTTS YICHNNDTVGEQV* S 
-TRPPVTYAI IMTLLVSRSEV 
QDHQLHMP***HCW*AGLKY 
27721 - ATAAACCATGGCGTCGACAAGACGTAATGACTGTTCAGAAATACCATCAAGTATGGTGAC - 27780
-I NHGVDKT * * LFRNT IKYGD 
-*TMASTRRNDCSEIPSSMVT 
KPWRRQDVMTVQKYHQVW*Q 
27781 - AGCTGCTCTTTGCAAATCAGGAATTGAGTGGTTTGCTGCATCAAGTGTGCGCGCAAAAAT - 27840
- S CSLQ IRN*VVCCIKCARKN 
-AALCKSGI EWFAASSVRAKI 
LLFANQELSGLLHQVCAQKL 
27841 - TGATCTGATAACACCAGCAGCCTGTGAGGGAAAACCACACAGTGGTGTTAAAACTGATCT - 27900
-*SDNTSSL*GKTTQWC*N*S 
DLITPAACEGKPHSGVKTDL 
I * *HQQPVRENHTVVLKLIS 

27901 - CTGTTGTCCAATGTTCCAAGCACCTTTTACGGGCTTTCCCTTGGTAACTTTATAGTTACC - 27960

-LLSNVPST FYGLSLGNFI VT 
-CCPMFQAPFTGFPLVTL*LP
VVQCSKHLLRAFPW*LYSYR 
27961 - GCAGGACTCAACAATGGTTTTGAAAGACTTGTAATCAAGACTCTTTATAGTGTCAATAAA - 28020 
-AGLNNGFERLVI KTLYSVNK 
-QDSTMVLKDL*SRLFIVSIK 
RTQQWF*KTCNQDSL*CQ*R 
28021 - GGCACTTGTAGAAGCAGAGAAAGATGCCAAAATGATGGCAACCTCTTCATTCAAATGAAA - 28080 
-GTCRSRERCQNDGNLFIQMK 
-ALVEAEKDAKMMATSSFK* K 
HL*KQRKMPK*WQPLHSNEN 
28081 - ATCGCCAACAATGTTAATGTTAACACGTTCACGACTCAGTATCTCAAGGAGATCCTCATT - 28140
-IANNVNVNTFTTQYLKEILI 
-SPTMLMLTRSRLSISRRSSF 
RQQC*C*HVHDSVSQGDPHS 
28141 - CAAGGTCTCCACATTGTCACCAGTAATGCCAGTATGGCCTGAGCCAATATCAGCACTAGC - 28200 
-QGLHIVTSNASMA*ANISTS 
-KVSTLSPVMPVWPEPISALA 
RSPHCHQ* CQYGLSQYQH*H 
28201 - ACGAGGAACCCAGTAGGCACGCTTATTATAGCAGCCAACATAGGCAAACACACAGCCTCC - 28260 
-TRNPVGTLI IAANIGKHTAS 
-RGTQ*ARLL*QPT*ANTQPP 
EEPSRHAYYSSQHRQTHSLQ 
28261 - AAAACATCTAGTCCTACCTCCCTTGCGGAGTCGAGTTTCAATGTTTGAGTGGTTGTGATA - 28320
-KTSSPTSLAESSFNV*VVVI 
-KHLVLPPLRSRVSMFEWL** 
NI*SYLPCGVEFQCLSGCDN 
28321 - ATCTGCAACACTATGCTCAGGTCCAATCTCTGGGTCTTGACAGGCAGGACATGGCATTTT - 28380 
-ICNTMLRSNLWVLTGRTWHF 
-SATLCSGPI SGS*QAGHGIF 
LQHYAQVQSLGLDRQDMAFS 

28381 - CACTACAGCATTAGTAGGTAGGTACCCACATGTAGTAGGTCCTTCAATAACTAAATTTTC - 28440

-HYSISR*VPTCSRSFNN*IF 
-TTALVGRYPHVVGPSITKFS 
LQH**VGTHM**VLQ*LNFQ 
28441 - AGTGCCACAATGTTCACAAGTGGCTTTCAGAAAGTCGCACGTCTGCCATGAAACTTCATC - 28500 
-SATMFTSG FQKVARLP * N F I 
-VPQCSQVAFRKSHVCHETSS 
CHNVHKWLSESRTSAMKLHR
28501 - GCAATGATTACATTTCATCAAGGTAGACAAGTGCATATTGTTACACTCCTGTGGAGATGC - 28560 
-AMI TFHQGRQVH IVTLLWRC 
-Q*LHFIKVDKCILLHSCGDA 
NDYISSR* TSAYCYTPVEMQ 
28561 - AACAGGGTACACAGAGCGTATACGCCCCATGAAACCCTCAGTCTTTTTCTTTTCAACACG - 28620
-NRVHRAYTPHETLSLFLFNT 
-TGYTERIRPMKPSVFFFSTR 
QGTQSVYAP*NPQSFSFQHV 
28621 - TGGTTGAATGACTTTGACTTTTGAGTTAAGAGGAAACACAAACTTTGGGCATTCCCCTTT - 28680
-WLNDFDF*VKRKHKLWAFPF 
-G*MTLTFELRGNTNFGHSPL 
VE*L*LLS*EETQTLGIPL* 
28681 - GAAAGTGTCAAATTTCTTGGCACTCTTAATTTCGAAGGGTGTCTGGTGCTCGTAGCTCTT - 28740
-ESVKFLGTLNFEGCLVLVAL 
-KVSNFLALLISKGVWCS* LL 
KCQISWHS*FRRVSGARSSY 
28741 - ATCAGAGCGCTCAGTGAACCAGGCAATTTCATGCTCATGGTCACGGCAGCAGTAGACACC - 28800 
-I RALSEPGNFMLMVTAAVDT 
-SERSVNQAISCSWSRQQ*TP 
QSAQ*TRQFHAHGHGSSRHL 
28801 - TCTCTTCGACTCGATGTAATCAAGTTGTTCGGAAAGAGTGCACATTGACTTGCCCGCGCG - 28860
-S LRLDVIKLFGKSAH*LARA 
-LFDSM*SSCSERVHIDLPAR
SSTRCNQVVRKECTLTCPRV 
28861 - TGCGAGAAAATCTTTGATGCAATCAAGAGGGTACCCATCTGGGCCACAGAAATTGTTGTC - 28920
-CEKIFDAIKRVP IWATEIVV 

-ARKSLMQSRGYPSGPQKLLS

RENL*CNQEGTHLGHRNCCR 
28921 - GACATAGCGAGTGACTGCACCTCCATTGAGCTCACGAGTGAGTTCACGGAGTGCACCACT - 28980 
-DIASDCTSIELTSEFTECTT 
-T*RVTAPPLSSRVSSRSAPL 
HSE*LHLH*AHE*VHGVHHC 
28981 - GCCATGCTTAGTGTTCCAGTTTTGTTCATAATCTTCAATGGGATCAGTGCCAAGCTCGTC - 29040
-AMLSVPVLFIIFNGISAKLV
-PCLVFQFCS*SSMGSVPSSS 
HA* CSSFVHNLQWDQCQARH 
29041 - ACCTAAGTCATAAGACTTTAGATCGATGCCATAGCTATGAGCACCGGCTCCCTTATTACC - 29100 
-T * V I R L * IDAIAMTTGSLIT 
-PKS*DFRSMP*L*PPAPLLP 
LSHKTLDRCHSY DHRLPYYR 
29101 - GTTCTTACGAAGAAGAACATTGCGGTATGCAATTGGGGTTTCGCCCACATGTGGCACGAG - 29160
-V LT KKN IAVCNW GFAHMWHE 
-FLRRRT LRYAIGVSPTCGTS 
SYEEEHCGMQLGFRPHVARV 
29161 - TACTCCCAGTGTTATACCGCTACGACCGTACTGAATGCCGTCCATTTCTGCAACCAGCTC - 29220 

- Y SQCY TATTVLNAVHFCNQL 
-TPSVIPLRPY*MPSISATSS

LPVLYRYDRTECRPFLQPAQ 
29221 - AACGACCTTGTGGCCGTGATTGGTGCTTAAGGCATCAGAACGTTTAATGAACACATAGGG - 29280 
-N DLVAVIGA* GIRTFNEH IG 
-TTLWP*LVLKASERLMNT*G 
RPCGRDWCLRRQNV* * T H R A 
29281 - CTGTTCAAGCTGGGGCAGTACGCCTTTTTCCAGCTCTACTAGACCACAAGTGCCATTTTT - 29340 
-LFKLGQYAFFQLY*TTSAIF
-CSSW GSTPFSSSTRPQVPFL 
VQAGAVRLFPALLDHKCHF* 
29341 - GAGGTGTTCACGTGCCTCCGATAGGGCCTCTTCCACAGAGTCCCCGAAGCCACGCACTAG - 29400 
-EVFTCLR* GLFHRVPEATH* 
-RCSRASDRASSTES PKPRTS 
GVHVPPIGPLPQSPRSHALA 
29401 - CACGTCTCTAACCTGAAGGACAGGCAAACTGAGTTGGACGTGTGTTTTCTCGTTGACACC - 29460
-HVSNLKDRQTELDVCFLVDT 

- T S L T * RTGKLSWTCVFSLTP 

R L * P E G Q A N * VGRVFSR*HQ 
29461 - AAGAACAAGGCTCTCCATCTTACCTTTCGGTCACACCCGGACGAAACCTAGGTATGCTGA - 29520 
-KNKALHLTFRSHPDET*VC* 
-RTRLSILPFGHTRTKPRYAD
EQGSPSYLSVTPGRNLGMLM 
29521 - TGATCGACTGCAACACGGACGAAACCGTAAGCAGTCTGCAGAAGAGGGACGAGTTACTCG - 29580

- * STATRTKP*AVCRRGTSYS 

-DRLQHGRNRKQSAEEGRVTR

IDCNTDETVSSLQKRDELLV 
29581 - TTTCTTGTCAACGACAGTAAAATTTATTATTGTTTATACTGCGTAGGTGCACTAGGCATG - 29640 
-FLVNDSKIYYCLYCVGALGM 
-FLSTTVKFIIVYTA*VH*AC 
SCQRQ*NLLLFILRRCTRHA 
29641 - CAGCCGAGCGACAGCTACACAGATTTTAAAGTTCGTTTAGAGAACAGATCTACAAGAGAT - 29700 
-QPS DSYTDFKVRLENRSTRD 
-SRATATQILKFV* RTDLQEI 
AERQLHRF*SSFREQIYKRS 
29701 - CGAGGTTGGTTGGCTTTTCCTGGGTAGGTAAAAACCTAATAT - 29742 
-RGWLAFPG*VKT*YX 
-EVGWLFLGR*KPNX
RLVGFSWVGKNLIX 